What is the probability of a thermodynamical transition?

If the second law of thermodynamics forbids a transition from one state to another, then it is still possible to make the transition happen by using a sufficient amount of work. But if we do not have access to this amount of work, can the transition happen probabilistically? In the thermodynamic limit, this probability tends to zero, but here we find that for finite-sized systems, it can be finite. We compute the maximum probability of a transition or a thermodynamical fluctuation from any initial state to any final state, and show that this maximum can be achieved for any final state which is block-diagonal in the energy eigenbasis. We also find upper and lower bounds on this transition probability, in terms of the work of transition. As a by-product, we introduce a finite set of thermodynamical monotones related to the thermo-majorization criteria which govern state transitions, and compute the work of transition in terms of them. The trade-off between the probability of a transition and any partial work added to aid in that transition is also considered. Our results have applications in entanglement theory, and we find the amount of entanglement required (or gained) when transforming one pure entangled state into any other.


I. INTRODUCTION 

Given a quantum system in a state $\rho$ with some Hamiltonian $H_1$, when can it be deterministically transformed into another state $\sigma$ associated with a potentially different Hamiltonian $H_2$? If we can put the system into contact with a heat bath at temperature $T$, then in the thermodynamical limit, and if interactions are short-ranged or screened, a transition will occur as long as the free energy of the initial configuration is larger than the free energy of the final configuration. The free energy of the state $\rho$ is defined as:

$$F(\rho, H_1) = \mathrm{tr}[H_1 \rho] - T S(\rho), \tag{1}$$

where $S(\rho)$ is the entropy, $S(\rho) = -\mathrm{tr}\,\rho \log \rho$. This is a formulation of the second law of thermodynamics, if we factor in energy conservation (the first law). If we wish to make a forbidden transition occur, then we need to inject an amount of work which is greater than the free energy difference between initial and final states.
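As a small numerical illustration of Eq. (1), the following sketch evaluates the free energy of a state that is diagonal in the energy eigenbasis. It assumes units in which Boltzmann's constant is one and uses natural logarithms; the function names `free_energy` and `gibbs_state` are hypothetical and chosen only for this example.

```python
import math

def free_energy(probs, energies, temperature):
    """F = tr[H rho] - T S(rho) for a state diagonal in the energy eigenbasis.

    probs       -- populations of the energy levels (assumed to sum to 1)
    energies    -- the corresponding energy eigenvalues
    temperature -- T, in units where Boltzmann's constant is 1 (an assumption)
    """
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    # Von Neumann entropy of a diagonal state; terms with p = 0 contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return mean_energy - temperature * entropy

def gibbs_state(energies, temperature):
    """Populations of the thermal state at the given temperature."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]
```

For a two-level system, the Gibbs state attains the free energy $-T \log Z$, and any other diagonal state evaluated this way has a larger value.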

However, what if we are interested in small, finite-sized systems? Or in systems with long-range interactions? The thermodynamics of systems in the micro-regime, where we do not take the thermodynamical limit, has gained increased importance as we cool and manipulate smaller and smaller systems at the nano scale and beyond [?]. Theoretical work has continued apace, with increased interest in the field in recent years [?]. If we do not take the thermodynamical limit, then provided $\sigma$ is block-diagonal in the energy eigenbasis, there is not just one criterion (the decreasing of the free energy), but a family of criteria which determine whether a state transition is possible. A set of such criteria, which have been proven to be necessary and sufficient conditions for quantum thermodynamical state transformations [?] (cf. [?]), are the so-called thermo-majorization criteria [?]. Thermo-majorization is a set of conditions that are more stringent than the ordinary second laws and had been conjectured to provide a limitation on the possibility of thermodynamical transformations since 1975 [7]. It is related [?] to a condition known as Gibbs-stochasticity [34, 35], a condition which can be extended to include fluctuations of work [35].

Once again though, if the diagonal state $\sigma$ is not thermo-majorized by $\rho$, then a transition is still possible, provided sufficient work is used. One can compute the work required (or gained) from this transition using thermo-majorization diagrams [?], via a linear program [33], or the relative-mixedness [24]. Suppose however, we want to make a transition from $\rho$ to $\sigma$, and it requires work which we cannot, or do not wish to, expend. Can we nonetheless make the transition with some probability $p$ rather than with certainty? And if so, what is the highest probability, $p^*$, that can be achieved? In particular, given $\rho$ and $\sigma$, we are interested in maximizing $p$ in the following process:

$$\rho \to \rho' = p\,\sigma + (1 - p)\,X, \tag{2}$$

with $X$ being some arbitrary state.

Such a transformation can be regarded as a fluctuation of a system's state, in the sense that the transformation is only probabilistic. Within the study of thermodynamics for small systems, great progress has already been made in analyzing how the work distribution associated with a given transformation or process can fluctuate [?] (see [?] for reviews on both the classical and quantum cases). Fluctuation relations such as the Jarzynski equality [?] and Crooks' theorem [38], developed under the paradigm of stochastic thermodynamics, have been used to calculate the work fluctuations of non-equilibrium processes. Investigating fluctuations in a system's state provides a natural, complementary strand of research which we are able to formulate and analyze in this paper by applying techniques from quantum information theory developed in [20]. In related work [36], we shall address the problem of fluctuating work within this information theoretic framework. This shall serve to unify the two approaches to thermodynamics for small systems and extend and provide insight into previous work based on the stochastic thermodynamics perspective.


Here, we will upper bound the maximum probability of a fluctuation between any given $\rho$ and $\sigma$. When $\sigma$ is block-diagonal in the energy eigenbasis, we will show that this bound can be achieved and furthermore, that there exists a two outcome measurement that can be performed on $\rho'$ such that we obtain $\sigma$ with the maximum probability $p^*$. Of course, measurements do not come for free in thermodynamics: it costs work to erase the record of the measurement outcome [?]. That this measurement can be performed is noted for completeness; however, we take Eq. (2) as our primary goal, defining what we mean by a thermodynamical transition. We will discuss measurements in Section III B, as they only provide a small correction of $kT \log 2$ to the work cost of a probabilistic transformation.


Our main result will be Theorem 5, which upper bounds the probability $p^*$ in terms of a minimization over a finite set of ratios between thermodynamical monotones, which are quantities that can only decrease under the set of allowed operations. When the final state is block-diagonal, this bound is achievable, but this may not be the case if the final state has coherences in energy. These monotones, which we will show are given by Eq. (41), can be thought of as analogous to free energies. This is proven in Theorem 4 and is equivalent to the thermo-majorization criteria of [7, 20]. The set of ratios that we use to bound $p^*$ thus gives an alternative way of verifying if the thermo-majorization criteria are satisfied. Rather than considering the thermo-majorization curves [20] or considering a continuous set of monotones [21], we provide a finite set of conditions to check. Indeed this set provides a strengthening of results from the theory of relative majorization [45, 14.B.4(c)] by reducing the number of constraints that need to be considered.


Before proving Theorem 5 we will consider in Section II the simpler case where the Hamiltonian of the system is trivial, i.e. $H \propto \mathbb{I}$. Solving the problem in this regime, referred to as Noisy Operations [?], will provide us with insight into the solution for non-trivial Hamiltonians. In this simplified situation, $p^*$ is given by Theorem 1. The result is similar in form to [47], which considers the analogous problem of probabilistic pure state entanglement manipulation using Local Operations and Classical Communication (LOCC). However, care must be taken: the class of operations allowed under LOCC is very different to what is allowed in thermodynamics. For example, under LOCC one can bring in pure states for free (which can be a source of work in thermodynamics) and one is allowed to make measurements for free (which costs work). Perhaps more importantly, many of the LOCC monotones are concave, which is not the case in Noisy Operations, thus we will require some different techniques. It should also be noted that in entanglement manipulation, the maximum probability achievable will be zero if the target state has a larger Schmidt rank than the starting state. Under Noisy Operations, we will see that $p^*$ is always non-zero (though it can be arbitrarily small).

In Section III we consider the general case of arbitrary initial and final Hamiltonians and states. We will prove our results using the paradigm of Thermal Operations (TO) [20, ?, 55]. There are a number of different paradigms one can use to study thermodynamics (e.g. allowing interaction Hamiltonians or changing energy levels); however, these other paradigms are equivalent to Thermal Operations [?], and thus Thermal Operations are the appropriate paradigm for studying fundamental limitations. We introduce Thermal Operations at the beginning of Section III. In the case of a trivial Hamiltonian, Thermal Operations reduce to Noisy Operations, the regime considered in Section II.

Our expression for the cost of a transition between any two states using only a finite number of monotones is given in Lemma 2 for Noisy Operations and Lemma 6 for Thermal Operations. The Noisy Operations result can be adapted to give an expression for the amount of entanglement required (or gained) when transforming any pure bipartite state into another under LOCC. This is given in Appendix A and generalizes existing expressions for the distillable entanglement [?] and cost of entanglement formation [?]. We also show how $p^*$ can be upper and lower bounded using the work of transitions from $\rho$ to $\sigma$ and $\sigma$ to $\rho$. This is done in Lemma 3 for the case of a trivial Hamiltonian, and in Lemma [?] for the general case.

Finally, we conclude in Section IV with a discussion on other goals, related to Eq. (2), which one could attempt when making a probabilistic transition. One such goal, the optimization of the heralded probability, is discussed in detail in Appendix B, where we obtain bounds on it, even in the presence of coherence or catalysts. The heralding probability can be thought of as a generalization of the case where one achieves Eq. (2) with a measurement, i.e.

$$\rho \otimes |0\rangle\langle 0| \to p\,\sigma \otimes |0\rangle\langle 0| + (1 - p)\,X \otimes |1\rangle\langle 1|,$$

and the transition is conclusive. This allows us to analyze state fluctuations in the presence of measurements, coherence and catalysis. We also pose some open questions. One of these regards how $p^*$ varies if we supply additional work to drive the transition from $\rho$ to $\sigma$ or demand that additional work be extracted. The solution for qubit systems with trivial Hamiltonian is given in Appendix C.

II. PROBABILITY OF TRANSITION UNDER NOISY OPERATIONS

Before investigating Eq. (2) in the context of Thermal Operations, we will first consider a simpler, special case: Noisy Operations. In this particular instance of thermodynamics, the Hamiltonian of the system under consideration is trivial. Noisy Operations were first defined in [?], where the problem of whether a transition between two given states is possible under a particular set of operations was considered. Within Noisy Operations, the following actions are allowed: i) a system of any dimension in the maximally mixed state can be added, ii) any subsystem can be discarded through tracing out and iii) any unitary can be applied to the global system. Throughout this paper, we shall use $\eta_i$ to denote the eigenvalues of $\rho$ and $\zeta_i$ to denote those of $\sigma$. For a comprehensive review of Noisy Operations, see [46].

Given two states, $\rho$ and $\sigma$, it was shown in [?] that a transition from $\rho$ to $\sigma$ is possible under Noisy Operations if and only if $\rho$ majorizes $\sigma$ (written $\rho \succ \sigma$). That is, if we list the eigenvalues of $\rho$ and $\sigma$ [52] in decreasing order and denote these ordered lists by $\vec{\eta} = \{\eta_1, \ldots, \eta_n\}$ and $\vec{\zeta} = \{\zeta_1, \ldots, \zeta_n\}$ respectively, the transition is possible if and only if:

$$V_l(\rho) \ge V_l(\sigma), \quad \forall l \in \{1, \ldots, n\}, \tag{3}$$

where:

$$V_l(\rho) = \sum_{i=1}^{l} \eta_i. \tag{4}$$
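The majorization test of Eqs. (3) and (4) is straightforward to evaluate numerically. The following is a minimal sketch, assuming both spectra are given as lists of the same length (the shorter one padded with zeros); the helper names `partial_sums` and `majorizes` are hypothetical.

```python
from itertools import accumulate

def partial_sums(eigenvalues):
    """Return [V_1, ..., V_n]: cumulative sums of the eigenvalues in decreasing order."""
    return list(accumulate(sorted(eigenvalues, reverse=True)))

def majorizes(rho_eigs, sigma_eigs, tol=1e-12):
    """True if V_l(rho) >= V_l(sigma) for every l, as in Eq. (3)."""
    if len(rho_eigs) != len(sigma_eigs):
        raise ValueError("pad the shorter spectrum with zeros first")
    return all(v_rho >= v_sigma - tol
               for v_rho, v_sigma in zip(partial_sums(rho_eigs),
                                         partial_sums(sigma_eigs)))
```

A pure state majorizes every state of the same dimension, while spectra such as (0.5, 0.5, 0) and (0.6, 0.2, 0.2) are incomparable: neither majorizes the other.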


Lorenz curves are a useful tool for visualizing these criteria (Figure 1). For a given state $\rho$, its Lorenz curve is formed by plotting the points:

$$\left\{ \left( \frac{k}{n}, \sum_{i=1}^{k} \eta_i \right) \right\}_{k=1}^{n}, \tag{5}$$

and connecting them piecewise linearly (together with the point (0, 0)) to form a concave curve. If $\rho$ majorizes $\sigma$, the Lorenz curve for $\rho$ is never below that of $\sigma$.

FIG. 1. Lorenz curves. a) The Lorenz curve for $\rho$ is defined by plotting the points $\{(k/n, \sum_{i=1}^{k} \eta_i)\}_{k=1}^{n}$. b) The transition from $\sigma$ to $\rho$ is possible under NO as the curve for $\sigma$ is never below that of $\rho$. c) The Lorenz curve for a maximally mixed state is given by the dashed line from (0, 0) to (1, 1). All other states majorize it. d) An example of a sharp state is shown. e) $s_{I_\infty(\sigma)}$ is the least sharp state that majorizes $\sigma$.

The functions defined in Eq. (4), and their analogue in Thermal Operations, will be crucial for the rest of the paper. They are monotones of the theory, only decreasing under Noisy Operations. Excellent reviews regarding the theory of majorization and Lorenz curves can be found in [45, 46].

A. Non-deterministic transitions

We will now consider transitions when the conditions given in Eq. (3) are not necessarily fulfilled. Here, rather than transforming $\rho$ to $\sigma$ with certainty, we shall do so with some probability as formulated in Eq. (2). In particular, we are interested in the maximum probability, $p^*$, that can be achieved. A similar problem is considered in [47] for entanglement manipulation and, adapting its techniques, the following theorem can be shown:

Theorem 1. Suppose we wish to transform the state $\rho$ to the state $\sigma$ under Noisy Operations. The maximum value of $p$ that can be achieved in the transition:

$$\rho \xrightarrow{\mathrm{NO}} \rho' = p\,\sigma + (1 - p)\,X, \tag{6}$$

is given by:

$$p^* = \min_{l \in \{1, \ldots, n\}} \frac{V_l(\rho)}{V_l(\sigma)}. \tag{7}$$
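Eq. (7) reduces to a minimum over $n$ ratios of partial sums. The sketch below evaluates it with exact rational arithmetic, assuming both spectra have the same length; the function name `max_transition_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

def max_transition_probability(rho_eigs, sigma_eigs):
    """p* = min_l V_l(rho) / V_l(sigma), as in Eq. (7).

    V_l(sigma) is strictly positive for every l because the largest
    eigenvalue of a normalized state is positive, so no division by zero occurs.
    """
    v_rho = accumulate(sorted(rho_eigs, reverse=True))
    v_sigma = accumulate(sorted(sigma_eigs, reverse=True))
    return min(a / b for a, b in zip(v_rho, v_sigma))

# Worked example: rho = diag(1/2, 1/4, 1/4), sigma = diag(3/4, 1/4, 0).
# The ratios V_l(rho)/V_l(sigma) are 2/3, 3/4 and 1, so p* = 2/3.
example = max_transition_probability(
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(3, 4), Fraction(1, 4), Fraction(0)])
```

When $\rho$ majorizes $\sigma$ every ratio is at least one and the last ratio equals one, so the function returns 1.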


Proof. The proof is split into two parts: first we apply Weyl's inequality and the definition of majorization to derive a contradiction if it were possible to achieve a value of $p$ larger than $p^*$. Next, we adapt the techniques of [47] to provide a protocol achieving $p = p^*$.

To achieve our first goal we begin by showing that, given Eq. (6):

$$V_l(\rho) \ge p\,V_l(\sigma), \quad \forall l. \tag{8}$$

To prove this, we make use of Weyl's inequality [54]. Given $n \times n$ Hermitian matrices $A$, $B$ and $C$ such that $A = B + C$, let $\{a_i\}_{i=1}^{n}$, $\{b_i\}_{i=1}^{n}$ and $\{c_i\}_{i=1}^{n}$ be their respective eigenvalues arranged in descending order. Weyl's inequality then states that:

$$b_i + c_n \le a_i \le b_i + c_1, \tag{9}$$

for all $i$. Applying this to $\rho'$, $p\,\sigma$ and $(1-p)X$, and writing $\eta'_i$ for the eigenvalues of $\rho'$, we obtain:

$$\eta'_i \ge p\,\zeta_i + (1 - p)\,x_n, \quad \forall i, \tag{10}$$

where $x_n$ is the smallest eigenvalue of $X$. As $X$ is a positive semidefinite matrix, $x_n \ge 0$ and:

$$\eta'_i \ge p\,\zeta_i, \quad \forall i. \tag{11}$$

Hence:

$$V_l(\rho) \ge V_l(\rho') = \sum_{i=1}^{l} \eta'_i \ge p \sum_{i=1}^{l} \zeta_i = p\,V_l(\sigma), \tag{12}$$

where the first inequality uses Eq. (3) and the second follows from Eq. (11).

Now suppose it was possible to achieve a value of $p$ greater than $p^*$ in Eq. (6). Then there would exist an $l$ such that $V_l(\rho) < p\,V_l(\sigma)$, contradicting Eq. (8).

To show that $p^*$ is obtainable, we define the following quantities. First, define $l_1$ by:

$$l_1 = \max \left\{ l : \frac{V_l(\rho)}{V_l(\sigma)} = p^* \equiv r^{(1)} \right\}. \tag{13}$$

Then we proceed iteratively and, provided $l_{i-1} < n$, define:

$$r^{(i)} = \min_{l > l_{i-1}} \frac{V_l(\rho) - V_{l_{i-1}}(\rho)}{V_l(\sigma) - V_{l_{i-1}}(\sigma)}, \tag{14}$$

so we have:

$$r^{(i)} \sum_{j=l_{i-1}+1}^{l} \zeta_j \le \sum_{j=l_{i-1}+1}^{l} \eta_j, \quad \forall l > l_{i-1}. \tag{15}$$

Define $l_i$ by:

$$l_i = \max \left\{ l : l > l_{i-1}, \ \frac{V_l(\rho) - V_{l_{i-1}}(\rho)}{V_l(\sigma) - V_{l_{i-1}}(\sigma)} = r^{(i)} \right\}. \tag{16}$$

Note that we have $r^{(i)} > r^{(i-1)}$. To see this, first observe that for $a, b, c, d > 0$:

$$\frac{a}{b} < \frac{a + c}{b + d} \iff \frac{a}{b} < \frac{c}{d}. \tag{17}$$

Setting:

$$a = V_{l_{i-1}}(\rho) - V_{l_{i-2}}(\rho),$$
$$b = V_{l_{i-1}}(\sigma) - V_{l_{i-2}}(\sigma),$$
$$c = V_{l_i}(\rho) - V_{l_{i-1}}(\rho),$$
$$d = V_{l_i}(\sigma) - V_{l_{i-1}}(\sigma),$$

so $a/b = r^{(i-1)}$ and $c/d = r^{(i)}$, then:

$$\frac{a + c}{b + d} = \frac{V_{l_i}(\rho) - V_{l_{i-2}}(\rho)}{V_{l_i}(\sigma) - V_{l_{i-2}}(\sigma)} > r^{(i-1)} = \frac{a}{b},$$

where the inequality follows from the definition of $l_{i-1}$. Using Eq. (17), the claim that $r^{(i)} > r^{(i-1)}$ now follows. Overall, this protocol generates a set of $l_i$ such that $0 = l_0 < l_1 < \ldots < l_k = n$ and a set of $r^{(i)}$ such that $p^* = r^{(1)} < \ldots < r^{(k)}$.

Now we split $\rho$ and $\sigma$ into blocks and define:

$$\rho_i = \mathrm{diag}\left(\eta_{l_{i-1}+1}, \ldots, \eta_{l_i}\right), \tag{18}$$

$$\sigma_i = \mathrm{diag}\left(\zeta_{l_{i-1}+1}, \ldots, \zeta_{l_i}\right). \tag{19}$$

Then from Eq. (15) (and the fact that equality occurs when $l = l_i$), $\rho_i$ majorizes $r^{(i)} \sigma_i$ and we can perform:

$$\rho_i \xrightarrow{\mathrm{NO}} r^{(i)} \sigma_i = p^* \sigma_i + \left(r^{(i)} - p^*\right) \sigma_i, \quad \forall i. \tag{20}$$

With a bit of massaging and recombining the blocks, this is the same form as Eq. (6) with $p = p^*$ and the blocks of $X$ being defined by:

$$X_i = \frac{r^{(i)} - p^*}{1 - p^*}\,\sigma_i. \tag{21}$$

□
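The iterative construction in Eqs. (13) to (21) can be traced on a small example. The sketch below is only an illustration under stated assumptions: both states are diagonal with spectra of equal length, $\sigma$ has full rank (so that the denominators in Eq. (14) never vanish), and exact `Fraction` arithmetic is used so that the equality tests in Eqs. (13) and (16) are reliable. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

def block_decomposition(rho_eigs, sigma_eigs):
    """Return (eta, zeta, cuts, ratios): cuts = [l_1..l_k], ratios = [r^(1)..r^(k)]."""
    eta = sorted(rho_eigs, reverse=True)
    zeta = sorted(sigma_eigs, reverse=True)
    if len(eta) != len(zeta) or min(zeta) <= 0:
        raise ValueError("sketch assumes equal dimensions and a full-rank sigma")
    n = len(eta)
    v_rho = [0] + list(accumulate(eta))      # v_rho[l] = V_l(rho), with V_0 = 0
    v_sigma = [0] + list(accumulate(zeta))
    cuts, ratios, prev = [], [], 0
    while prev < n:
        slopes = {l: (v_rho[l] - v_rho[prev]) / (v_sigma[l] - v_sigma[prev])
                  for l in range(prev + 1, n + 1)}
        r = min(slopes.values())                            # Eq. (14); Eq. (7) when prev = 0
        prev = max(l for l, s in slopes.items() if s == r)  # Eqs. (13) and (16)
        cuts.append(prev)
        ratios.append(r)
    return eta, zeta, cuts, ratios

def remainder_spectrum(zeta, cuts, ratios):
    """Diagonal of X from Eq. (21); returns None when p* = 1 (X is then arbitrary)."""
    p_star = ratios[0]
    if p_star == 1:
        return None
    x, start = [], 0
    for cut, r in zip(cuts, ratios):
        x += [(r - p_star) / (1 - p_star) * z for z in zeta[start:cut]]
        start = cut
    return x

eta, zeta, cuts, ratios = block_decomposition(
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(3, 4), Fraction(1, 8), Fraction(1, 8)])
x = remainder_spectrum(zeta, cuts, ratios)
```

In this example the construction yields two blocks with $r^{(1)} = p^* = 2/3$ and $r^{(2)} = 2$, and the remainder $X = \mathrm{diag}(0, 1/2, 1/2)$ is a normalized state, as Eq. (6) requires.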


Note that as the endpoints of the Lorenz curves coincide at (1, 1) and $\eta_1 > 0$, we are guaranteed that $0 < p^* \le 1$.

If we want to obtain $\sigma$ from $\rho$ with probability $p^*$ rather than have it as part of a probabilistic mixture as per Eq. (6), we can do so by performing a two outcome measurement, with measurement operators $\{\sqrt{M}, \sqrt{\mathbb{I} - M}\}$, where the blocks of $M$ are given by:

$$M_i = \mathrm{diag}\left(\frac{p^*}{r^{(i)}}, \ldots, \frac{p^*}{r^{(i)}}\right). \tag{22}$$

To see that $M$ is a valid measurement, we note that in general $0 < p^*/r^{(i)} \le 1$. Hence both $\sqrt{M}$ and $\sqrt{\mathbb{I} - M}$ are well defined, and their squares trivially add up to the identity.

After applying this measurement to $\rho'$ and reading the result, we will have either:

$$\sqrt{M}\,\rho'\,\sqrt{M} = p^* \sigma, \tag{23}$$

or

$$\sqrt{\mathbb{I} - M}\,\rho'\,\sqrt{\mathbb{I} - M} = (1 - p^*)\,X. \tag{24}$$

However, performing this measurement is outside of the class of Noisy Operations and hence costs work. As such, if a general two outcome measurement is allowed without taking its cost into account, it can be possible to transform $\rho$ into $\sigma$ with probability greater than $p^*$. For example, if $\rho$ and $\sigma$ are qubits, we can convert $\rho$ into $\sigma$ with certainty using this extra resource. First we add an additional qubit in the maximally mixed state and measure it in the computational basis. This results in a pure state, either $|0\rangle$ or $|1\rangle$. As these majorize all other qubit states, we can use it to obtain any $\sigma$ with certainty.

B. Nonuniformity of transition under Noisy Operations

If it is not possible to deterministically convert $\rho$ into $\sigma$ using Noisy Operations, performing the transformation with certainty will cost some resource, in the form of nonuniformity. For instance, if we add some pure states of sufficiently high dimension, a previously impossible transition will become possible. Adding these additional pure states can be thought of as the analogue of adding work. Similarly, if $\rho$ can be converted into $\sigma$ using Noisy Operations, it may be possible to extract some nonuniformity (e.g. by transforming some maximally mixed states into pure states). This is the analogue of extracting work. More generally, we shall extract or expend the equivalent of work using sharp states. These sharp states, as discussed in the next subsection, will serve as a natural unit for the nonuniformity resource. We will compute the nonuniformity of transition in terms of a finite set of ratios of monotones. This is done in a similar manner to [46], although we show that the minimization can be done over fewer points.


1. Sharp States

Quantifying the optimal amount of work of transition for the more general Thermal Operations was considered in [?]. We shall denote the Noisy Operations equivalent of work, the nonuniformity of transition, by $I_{\rho \to \sigma}$. If nonuniformity must be added, the quantity is negative, while if we can extract nonuniformity, it will be positive. For $|I_{\rho \to \sigma}| = \log \frac{d}{j}$, we define an associated sharp state [?] by:

$$s_{|I_{\rho \to \sigma}|} = \mathrm{diag}\Big(\underbrace{\tfrac{1}{j}, \ldots, \tfrac{1}{j}}_{j}, \underbrace{0, \ldots, 0}_{d-j}\Big). \tag{25}$$

Appending a sharp state $s_{\log \frac{d}{j}}$ to the system is equivalent to introducing $\log \frac{d}{j}$ units of nonuniformity. See Figure 1 for an example of a sharp state's Lorenz curve. The state is such that:

$$\rho \otimes s_{|I_{\rho \to \sigma}|} \xrightarrow{\mathrm{NO}} \sigma, \quad \text{if } I_{\rho \to \sigma} \le 0,$$
$$\rho \xrightarrow{\mathrm{NO}} \sigma \otimes s_{|I_{\rho \to \sigma}|}, \quad \text{if } I_{\rho \to \sigma} > 0. \tag{26}$$

In terms of Lorenz curves, tensoring a state $\rho$ with a sharp state $s_I$ has the effect of compressing the Lorenz curve of $\rho$ by a factor of $2^{-I}$ with respect to the x-axis [?].

2. Monotones for Noisy Operations and the nonuniformity of transition

The function $V_l(\rho)$ is equal to the height of the Lorenz curve of $\rho$ at $x = l/n$. An alternative set of monotones, $L_y(\rho)$ where $0 \le y \le 1$, can be defined as the shortest horizontal distance between the Lorenz curve of $\rho$ and the y-axis at $y$. Note that these functions never decrease under Noisy Operations. In particular:

$$L_{y_k}(\rho) = \frac{k}{n}, \quad \text{for } y_k = \sum_{i=1}^{k} \eta_i, \ 1 \le k < \mathrm{rank}(\rho),$$
$$L_1(\rho) = \frac{\mathrm{rank}(\rho)}{n}. \tag{27}$$

If we define the set $\mathcal{D}(\sigma)$ by:

$$\mathcal{D}(\sigma) = \left\{ \sum_{i=1}^{k} \zeta_i \right\}_{k=1}^{\mathrm{rank}(\sigma)}, \tag{28}$$

then a transition from $\rho$ to $\sigma$ is achievable with certainty under Noisy Operations if and only if:

$$L_y(\rho) \le L_y(\sigma), \quad \forall y \in \mathcal{D}(\sigma). \tag{29}$$

That it is sufficient to consider only $y \in \mathcal{D}(\sigma)$ will be justified below.

The horizontal monotones, $L_y$, also allow us to quantify the optimal work of transition that is required or extracted in going from $\rho$ to $\sigma$:

Lemma 2. Given two states $\rho$ and $\sigma$, under Noisy Operations:

$$2^{-I_{\rho \to \sigma}} = \max_{y \in \mathcal{D}(\sigma)} \frac{L_y(\rho)}{L_y(\sigma)}. \tag{30}$$


Proof. To prove this, we make use of the geometrical structure of Lorenz curves and the properties of $I_{\rho \to \sigma}$. Note that we have:

$$2^{-I_{\rho \to \sigma}} = \max_{y \in [0,1]} \frac{L_y(\rho)}{L_y(\sigma)}, \tag{31}$$

as this follows from the fact that to obtain the optimal value of $I_{\rho \to \sigma}$, we wish to rescale the Lorenz curve of $\rho$ with respect to the x-axis in such a way that it just majorizes that of $\sigma$: the curves should touch but not cross. The amount that we need to rescale by is given by Eq. (31).

We now show that it is sufficient to maximize over $y \in \mathcal{D}(\sigma)$. Let $s_0 = 0$ and $s_k = \sum_{i=1}^{k} \zeta_i$ for $1 \le k \le \mathrm{rank}(\sigma)$. Then, for $1 \le j \le \mathrm{rank}(\sigma)$, as the Lorenz curve of $\sigma$ is a straight line on the interval $[s_{j-1}, s_j]$ and the Lorenz curve of $\rho$ is concave:

$$\max_{y \in [s_{j-1}, s_j]} \frac{L_y(\rho)}{L_y(\sigma)} \le \max_{r \in [0,1]} \frac{r\,L_{s_{j-1}}(\rho) + (1 - r)\,L_{s_j}(\rho)}{r\,\frac{j-1}{n} + (1 - r)\,\frac{j}{n}}. \tag{32}$$

It is straightforward to check that the maximum value occurs at either $r = 0$ or $r = 1$. We can thus replace the inequality in Eq. (32) with an equality and it follows that it suffices to maximize over $y \in \mathcal{D}(\sigma)$. □

As $\rho \to \sigma$ is possible if and only if $I_{\rho \to \sigma} \ge 0$, the finite set in Eq. (29) is justified.

Note that in [46] it was shown that it is possible to calculate $I_{\rho \to \sigma}$ by performing an optimization over the ratios calculated at the ‘elbows’ (see Figure 1 for a definition) of both $\rho$ and $\sigma$. In Lemma 2 we have shown that it suffices to consider just the ‘elbows’ of $\sigma$.
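Lemma 2 can be evaluated directly from the two spectra. The following is a minimal sketch, assuming diagonal states of equal dimension and exact rational eigenvalues; `horizontal_distance` and `nonuniformity_of_transition` are hypothetical names, and the result is expressed in bits (base-2 logarithm), matching the factor $2^{-I}$ in Eq. (30).

```python
import math
from fractions import Fraction
from itertools import accumulate

def horizontal_distance(eigs, y):
    """L_y: smallest x at which the Lorenz curve of the spectrum reaches height y."""
    eta = sorted(eigs, reverse=True)
    n = len(eta)
    height = 0
    for k, e in enumerate(eta):
        if e > 0 and y <= height + e:
            # Linear interpolation on the segment between elbows k and k + 1.
            return (k + (y - height) / e) / n
        height += e
    raise ValueError("y must lie in [0, 1]")

def nonuniformity_of_transition(rho_eigs, sigma_eigs):
    """I_{rho -> sigma} from Eq. (30), maximizing only over the elbows of sigma."""
    zeta = sorted(sigma_eigs, reverse=True)
    elbows = [y for y, z in zip(accumulate(zeta), zeta) if z > 0]   # D(sigma)
    ratio = max(horizontal_distance(rho_eigs, y) / horizontal_distance(sigma_eigs, y)
                for y in elbows)
    return -math.log2(ratio)

# Two sharp states on a four-level system: rho = s_{log 2}, sigma = s_{log 4} (pure).
rho = [Fraction(1, 2), Fraction(1, 2), 0, 0]
sigma = [1, 0, 0, 0]
cost = nonuniformity_of_transition(rho, sigma)    # one bit must be added
gain = nonuniformity_of_transition(sigma, rho)    # one bit can be extracted
```

The sign convention follows the text: a negative value means nonuniformity must be supplied, a positive value means it can be extracted.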


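3. Bounds on the transition probability

The quantities $I_{\rho \to \sigma}$ and $I_{\sigma \to \rho}$ can be used to bound $p^*$ as follows:

Lemma 3. Given two states $\rho$ and $\sigma$, under Noisy Operations:

$$2^{I_{\rho \to \sigma}} \le p^* \le 2^{-I_{\sigma \to \rho}}, \tag{33}$$

where, as $p^* \le 1$, we assume $I_{\rho \to \sigma} \le 0$. If $I_{\rho \to \sigma} \ge 0$, $p^* = 1$ and the transformation from $\rho$ to $\sigma$ can be done deterministically, potentially extracting a finite amount of nonuniformity.

Proof. We start by proving the lower bound, giving a protocol which achieves $p = 2^{I_{\rho \to \sigma}}$. The upper bound is derived by considering properties of the purity of the least sharp state that majorizes $\rho$.

Assuming $|I_{\rho \to \sigma}| = \log \frac{d}{j}$ for simplicity, and defining $\mathbb{I}_d$ to be the maximally mixed state of a $d$ level system:

$$\rho \xrightarrow{\mathrm{NO}} \rho \otimes \mathbb{I}_d = \frac{j}{d}\,\rho \otimes s_{\log \frac{d}{j}} + \frac{d - j}{d}\,\rho \otimes s_{\log \frac{d}{d-j}} \xrightarrow{\mathrm{NO}} \frac{j}{d}\,\sigma + \frac{d - j}{d}\,\mathrm{Tr}_B\,Y, \tag{34}$$

where $Y$ is the state obtained by applying the second Noisy Operation to $\rho \otimes s_{\log \frac{d}{d-j}}$. Using this protocol, we obtain something of the form of Eq. (6) with $p = j/d = 2^{I_{\rho \to \sigma}}$ and $X = \mathrm{Tr}_B\,Y$. As $p^*$ is the maximum value of $p$ obtainable in Eq. (6), we derive the lower bound.

We now consider the upper bound and, to obtain a useful bound, assume $I_{\sigma \to \rho} \ge 0$. We define $I_\infty(\rho)$ as the nonuniformity of formation of $\rho$ under NO [?], given by $I_\infty(\rho) = \log(\eta_1 n)$, and hence let $s_{I_\infty(\rho)}$ be the least sharp state that majorizes $\rho$ (see Figure 1). Note that $I_\infty$ decreases under Noisy Operations and is additive across tensor products [?]. In terms of the eigenvalues of $\rho$ and $\sigma$:

$$s_{I_\infty(\rho)} = s_{\log(\eta_1 n)}, \qquad s_{I_\infty(\sigma)} = s_{\log(\zeta_1 n)}. \tag{35}$$

By definition, as $I_{\sigma \to \rho} \ge 0$:

$$\sigma \xrightarrow{\mathrm{NO}} \rho \otimes s_{I_{\sigma \to \rho}}. \tag{36}$$

Now, using first the monotonicity of $I_\infty$ and then the additivity:

$$I_\infty(\sigma) \ge I_\infty\left(\rho \otimes s_{I_{\sigma \to \rho}}\right) \quad \text{(monotonicity)}$$
$$= I_\infty(\rho) + I_{\sigma \to \rho} \quad \text{(additivity)}$$
$$\Rightarrow I_{\sigma \to \rho} \le I_\infty(\sigma) - I_\infty(\rho) = \log(\zeta_1 n) - \log(\eta_1 n) = \log\frac{\zeta_1}{\eta_1}$$
$$\Rightarrow 2^{-I_{\sigma \to \rho}} \ge \frac{\eta_1}{\zeta_1} = \frac{V_1(\rho)}{V_1(\sigma)} \ge p^* \quad \text{(by definition)},$$

as required.

□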
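The two bounds of Lemma 3 can be checked numerically against the exact value of $p^*$ from Eq. (7). The sketch below assumes diagonal states of equal dimension with rational eigenvalues; it works with the ratios $2^{I_{\rho \to \sigma}}$ and $2^{-I_{\sigma \to \rho}}$ directly so that all quantities stay exact. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

def lorenz_width(eigs, y):
    """L_y: smallest x at which the Lorenz curve of the spectrum reaches height y."""
    eta = sorted(eigs, reverse=True)
    height = 0
    for k, e in enumerate(eta):
        if e > 0 and y <= height + e:
            return (k + (y - height) / e) / len(eta)
        height += e
    raise ValueError("y must lie in [0, 1]")

def horizontal_ratio(source, target):
    """max over y in D(target) of L_y(source) / L_y(target) = 2^{-I_{source -> target}}."""
    zeta = sorted(target, reverse=True)
    elbows = [y for y, z in zip(accumulate(zeta), zeta) if z > 0]
    return max(lorenz_width(source, y) / lorenz_width(target, y) for y in elbows)

def p_star(rho, sigma):
    """Eq. (7): minimum vertical ratio of the two Lorenz curves."""
    return min(a / b for a, b in zip(accumulate(sorted(rho, reverse=True)),
                                     accumulate(sorted(sigma, reverse=True))))

def lemma3_bounds(rho, sigma):
    """Return (2^{I_{rho->sigma}}, p*, 2^{-I_{sigma->rho}})."""
    return 1 / horizontal_ratio(rho, sigma), p_star(rho, sigma), horizontal_ratio(sigma, rho)

generic = lemma3_bounds([Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
                        [Fraction(3, 4), Fraction(1, 8), Fraction(1, 8)])
sharp = lemma3_bounds([Fraction(1, 4)] * 4,
                      [Fraction(1, 2), Fraction(1, 2), Fraction(0), Fraction(0)])
```

For the first pair of spectra the bounds are $1/2 \le 2/3 \le 1$; for the second pair, where both states are sharp, the two bounds coincide with $p^* = 1/2$.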


From Eq. (33) we can see that when $I_{\rho \to \sigma} = -I_{\sigma \to \rho} \equiv I$ (that is, in a reversible transition) then $p^* = 2^{I}$. This occurs when either $\sigma \leftrightarrow \rho \otimes s_{|I|}$ or $\rho \leftrightarrow \sigma \otimes s_{|I|}$, depending on whether $I$ is positive or negative (when $I > 0$ the transition is deterministic). In terms of Lorenz curves, this means that the curves of $\rho$ and $\sigma$ have the same shape up to re-scaling by a factor $2^{-I}$. In particular, this is the case when both $\rho$ and $\sigma$ are sharp states, where both Lorenz curves are straight lines.

This result can be applied in the thermodynamic regime of many independent copies. If we want to perform a transition such as:

$$\rho^{\otimes N} \to \sigma^{\otimes N}, \tag{37}$$

we need an amount of work given by $-N I_{\rho \to \sigma}$. Hence, the probability of success in such a case is bounded by:

$$2^{N I_{\rho \to \sigma}} \le p^* \le 2^{-N I_{\sigma \to \rho}}, \tag{38}$$

which tends to 0 for large $N$. This can be seen as a way in which statistical fluctuations are suppressed in the thermodynamic limit.


4. Lorenz curve interpretation

In terms of Lorenz curves, adding $|I_{\rho \to \sigma}|$ nonuniformity to $\rho$ to make the transition possible is equivalent to compressing the Lorenz curve with respect to the x-axis by a ratio $2^{-|I_{\rho \to \sigma}|}$, such that the curve of $\rho$ lies just above and touches that of $\sigma$. Hence, a compression by $p^* \ge 2^{-|I_{\rho \to \sigma}|}$ must mean that there is at least a point of the compressed curve just below or touching $\sigma$. A proof of this is given in Figure 2.

Extracting $I_{\sigma \to \rho}$ nonuniformity from $\sigma$ before performing NO into $\rho$ is equivalent to compressing the curve of $\rho$ by a ratio of $2^{-I_{\sigma \to \rho}}$ such that the curve of $\sigma$ lies just above and touches that of $\rho$. Hence, to prove the upper bound in Eq. (33), it suffices to show that in compressing the curve of $\rho$ by $p^*$ at least one point of the new curve must lie above or touch that of $\sigma$. In Figure 3 we show a diagrammatic version of the proof given in Section II B.

It should be noted that with Lemma 3 we are proving a general statement about concave Lorenz curves. That is, the minimum vertical ratio of two given curves ($p^*$) is lower and upper bounded respectively by the minimum and the maximum horizontal ratio of the two.

FIG. 2. We plot the curves of $\rho$, $\sigma$ and $\rho$ compressed by $p^*$ (with respect to the x-axis). The points A and B at which the vertical ratio between the curves of $\rho$ and $\sigma$ is maximum (which sets $l_1$ and $p^*$), and the sharp states that pass through those points, are also shown as dashed lines. After compressing the Lorenz curve of $\rho$ by a ratio of $p^*$, the point B will be taken to C, which will always either be below the curve of $\sigma$ or just touching it. This proves the lower bound in Eq. (33).

FIG. 3. We plot the curves of $\rho$, $\sigma$ and $\rho$ compressed by $p^*$ (with respect to the x-axis). The points A and B at which the vertical ratio between the curves of $\rho$ and $\sigma$ is maximum (which sets $l_1$ and $p^*$) and the sharp states $s_{I_\infty(\rho)}$ and $s_{I_\infty(\sigma)}$ are also shown as dashed lines. Given that for sharp states all bounds are saturated, the appropriate maximum vertical and horizontal ratios coincide, and are $\eta_1/\zeta_1$, the ratio of the heights of B' and A'. But this ratio is, by definition, bigger than or equal to $p^*$, the ratio between A and B. This means that if the curve of $\rho$ is compressed by $p^*$, the point B' is mapped to C just above or touching the curve of $\sigma$, proving the upper bound of Eq. (33).


III. PROBABILITY OF TRANSITION UNDER THERMAL OPERATIONS

Noisy Operations can be generalized to include systems with arbitrary, finite Hamiltonians. This is the resource theory of Thermal Operations [20, ?, 35, ?]. Within this scheme, the allowed operations are: i) a system with any Hamiltonian in the Gibbs state of that Hamiltonian can be added, ii) any subsystem can be discarded through tracing out and iii) any energy-conserving unitary, i.e. those unitaries that commute with the total Hamiltonian, can be applied to the global system. These operations model the thermodynamics of a system in the presence of an ideal heat bath [20, ?]. Note that while the heat bath the system is in contact with is assumed to be large, Thermal Operations include processes that only interact with a small part of the bath. As such, limitations derived with respect to such an idealized bath can be regarded as truly fundamental. Even though the bath size can be large, the system of interest is fixed, and can, for example, be only a single system. Thermal Operations thus describe processes beyond the thermodynamic limit.

In general, the initial and final systems may have different Hamiltonians but, by making use of the ‘switching qubit’ construction in [20], we can w.l.o.g. assume that the initial and final Hamiltonians are the same. As such, the results in this section will assume this, but in Section III D we will discuss how a changing Hamiltonian affects them. In Appendix H of [?] it was shown that other mainstream thermodynamical paradigms such as time dependent Hamiltonians, the insertion of interaction terms between system, bath and work systems and various master equations are all included within the scope of Thermal Operations.

In the absence of catalysts, and provided the final state is block-diagonal in the energy eigenbasis, it was established in [?] that a transition from $\rho$ to $\sigma$ is possible under Thermal Operations if and only if $\rho$ thermo-majorizes $\sigma$. This is similar in form to the majorization criteria of Noisy Operations and can be visualized in terms of thermo-majorization diagrams, which are similar to Lorenz curves but with two crucial differences.

Suppose $\rho$ is also block-diagonal in the energy eigenbasis with eigenvalue $\eta_i$ associated with energy level $E_i$, for $1 \le i \le n$. Firstly, rather than ordering according to the magnitude of $\eta_i$, we instead $\beta$-order them, listing $\eta_i e^{\beta E_i}$ in descending order.

The second difference is that we no longer plot the $\beta$-ordered $\eta_i$ at evenly spaced intervals. Instead we plot the points:

$$\left\{ \left( \sum_{i=1}^{k} e^{-\beta E_i^{(\rho)}}, \ \sum_{i=1}^{k} \eta_i^{(\rho)} \right) \right\}_{k=1}^{n}. \tag{39}$$

Similarly to how Eq. Q defines monotones for the 
Noisy Operations resource theory, the height of the /3- 
ordered thermo-majorization curves provides monotones 
for Thermal Operations. If we denote the height of 
the thermo-majorization curve of p at x by V x (p), for 
0 < x < Z (where Z is the partition function), then 
by the thermo-majorization criteria, this function is non¬ 
increasing under Thermal Operations. In particular, for 
block-diagonal p, we have: 

V Xk (p) = Vi >] > where x k = Yl e ~^ E * P) ■ ( 41 ) 

t= 1 2=1 


FIG. 4. We show the $\beta$-ordered thermo-majorization diagrams for various states of the system. Note that different states may have different $\beta$-orderings and the markings on the x-axis correspond to one particular $\beta$-ordering. The curves always end at $(Z, 1)$. The thermo-majorization criteria states that we can take a state to another under Thermal Operations if and only if the curve of the initial state is above that of the final state. Hence, in this case (provided $\rho$ is block-diagonal in the energy eigenbasis) there is a set of operations such that $\sigma \to \rho$, but not for the reverse process.


where the superscript $(\rho)$ on $E_i$ and $\eta_i$ indicates that they have been $\beta$-ordered and that this ordering depends on $\rho$. Thermo-majorization states that $\rho$ can be deterministically converted into a block-diagonal $\sigma$ if and only if its thermo-majorization curve never lies below that of $\sigma$, as is shown in Figure 4. This is analogous to the case of Noisy Operations. In what follows, we assume that the $\eta_i$ have been $\beta$-ordered unless otherwise stated.
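As a rough illustration of the construction just described, the following Python sketch builds the elbow points of a thermo-majorization curve for a block-diagonal state and evaluates its height. The function names (`thermo_curve`, `height`) and the representation of a state as a list of populations paired with a list of energies are assumptions made for this example; they are not notation from the text.

```python
import math

def thermo_curve(probs, energies, beta):
    """Elbow points of the thermo-majorization curve, starting from (0, 0).

    probs[i] is the population of the level with energy energies[i]
    (a hypothetical representation of a block-diagonal state).
    """
    # beta-ordering: list the levels by probs[i] * exp(beta * E_i), largest first.
    order = sorted(range(len(probs)),
                   key=lambda i: probs[i] * math.exp(beta * energies[i]),
                   reverse=True)
    points, x, y = [(0.0, 0.0)], 0.0, 0.0
    for i in order:
        x += math.exp(-beta * energies[i])  # horizontal step: Gibbs weight
        y += probs[i]                       # vertical step: population
        points.append((x, y))
    return points

def height(points, x):
    """V_x: height of the piecewise-linear curve at horizontal position x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

# Example: a qubit with energies 0 and 1 at inverse temperature beta = 1.
beta, energies = 1.0, [0.0, 1.0]
Z = sum(math.exp(-beta * e) for e in energies)
thermal = [math.exp(-beta * e) / Z for e in energies]
excited = [0.0, 1.0]

thermal_curve = thermo_curve(thermal, energies, beta)
excited_curve = thermo_curve(excited, energies, beta)
```

In this example the thermal state gives a straight line from the origin to $(Z, 1)$, while the excited state reaches height 1 already at $x = e^{-\beta}$, so its curve lies above every other curve for this Hamiltonian.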

If $\rho$ is not block-diagonal in the energy eigenbasis, to determine if a transition is possible we consider the thermo-majorization curve associated with the state formed by decohering $\rho$ in the energy eigenbasis. This state, $\rho_D$, is given by:

$$\rho_D = \sum_{i=1}^{n} |E_i\rangle\langle E_i|\,\rho\,|E_i\rangle\langle E_i|, \tag{40}$$

where $|E_i\rangle$ is the eigenvector of the system's Hamiltonian associated with energy level $E_i$. The operation of decohering $\rho$ to give $\rho_D$ is a Thermal Operation and commutes with all other Thermal Operations [55]. A transition from $\rho$ to $\sigma$, where $\sigma$ is block-diagonal in the energy eigenbasis, can be made deterministically if and only if the thermo-majorization curve of $\rho_D$ is never below that of $\sigma$.

Finally, if $\sigma$ is not block-diagonal, a transition from $\rho$ to $\sigma$ is possible only if $\rho_D$ thermo-majorizes $\sigma_D$, and finding a set of sufficient conditions is an open question.

In what follows, the thermo-majorization curve of a state with coherences is defined to be the thermo-majorization curve of that state decohered in the energy eigenbasis as per Eq. (40).


These monotones also give us an alternative way of stating the thermo-majorization criteria:

Theorem 4. Suppose $\sigma$ is block-diagonal in the energy eigenbasis. Let $\mathcal{L}(\sigma) = \left\{ \sum_{i=1}^{k} e^{-\beta E_i^{(\sigma)}} \right\}_{k=1}^{n}$. Then $\rho$ can be deterministically converted into $\sigma$ under Thermal Operations if and only if:

$$V_x(\rho) \ge V_x(\sigma), \quad \forall x \in \mathcal{L}(\sigma). \tag{42}$$


Proof. To prove this theorem, we make use of the concavity properties of thermo-majorization curves. Suppose $\rho \xrightarrow{\mathrm{TO}} \sigma$. Then by thermo-majorization, $V_x(\rho) \ge V_x(\sigma)$ for $0 \le x \le Z$, and in particular Eq. (42) holds.

Conversely, suppose Eq. (42) holds and, setting $t_0 = 0$, label the elements of $\mathcal{L}(\sigma)$ arranged in increasing order by $t_i$ for $i = 1$ to $n$. Then on the interval $[t_{i-1}, t_i]$, for $1 \le i \le n$, the thermo-majorization curve of $\sigma$ is given by a straight line. From $\rho$, define the block-diagonal state $\rho_\sigma$ by the thermo-majorization curve:

$$\left\{ t_i,\ V_{t_i}(\rho) \right\}_{i=1}^{n}, \tag{43}$$

and note that due to the concavity of thermo-majorization curves, $\rho$ thermo-majorizes $\rho_\sigma$. On the interval $[t_{i-1}, t_i]$, for $1 \le i \le n$, the thermo-majorization curve of $\rho_\sigma$ is also given by a straight line. The construction of this state $\rho_\sigma$ is shown in Figure 5.

As $V_{t_i}(\rho_\sigma) = V_{t_i}(\rho)$, $\forall i$, by construction, Eq. (42) implies that $V_{t_i}(\rho_\sigma) \ge V_{t_i}(\sigma)$, $\forall i$. Hence on the interval $[t_{i-1}, t_i]$, for $1 \le i \le n$, the thermo-majorization curves for $\rho_\sigma$ and $\sigma$, and therefore $\rho$ and $\sigma$, do not cross. As this holds for all $i$ and the intervals cover $[0, Z]$, the thermo-majorization curve of $\rho$ is never below that of $\sigma$ and we can perform $\rho \xrightarrow{\mathrm{TO}} \sigma$ deterministically.

□


If we define the number of 'elbows' in the thermo-majorization curve of $\sigma$ to be $j$, this reduces thermo-majorization to checking $j$ criteria and generalizes Lemma 17 of [46] to Thermal Operations. Note also that if $\sigma$ is not block-diagonal in the energy eigenbasis, Eq. (42) gives a necessary but not sufficient condition for the transition from $\rho$ to $\sigma$ to be possible.
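The finite check of Theorem 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: states are block-diagonal and given as lists of populations over a shared list of energies, and the helper names (`thermo_curve`, `height`, `thermo_majorizes`) as well as the numerical tolerance are hypothetical choices made for this example.

```python
import math

def thermo_curve(probs, energies, beta):
    """Elbow points of the thermo-majorization curve, starting from (0, 0)."""
    order = sorted(range(len(probs)),
                   key=lambda i: probs[i] * math.exp(beta * energies[i]),
                   reverse=True)  # beta-ordering
    points, x, y = [(0.0, 0.0)], 0.0, 0.0
    for i in order:
        x += math.exp(-beta * energies[i])
        y += probs[i]
        points.append((x, y))
    return points

def height(points, x):
    """V_x: height of the piecewise-linear curve at horizontal position x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def thermo_majorizes(rho, sigma, energies, beta, tol=1e-12):
    """Theorem 4: compare heights only at the elbows of sigma's curve."""
    curve_rho = thermo_curve(rho, energies, beta)
    curve_sigma = thermo_curve(sigma, energies, beta)
    return all(height(curve_rho, x) >= y - tol for x, y in curve_sigma[1:])

# Example: a qubit with energies 0 and 1 at beta = 1.
beta, energies = 1.0, [0.0, 1.0]
Z = sum(math.exp(-beta * e) for e in energies)
thermal = [math.exp(-beta * e) / Z for e in energies]
excited = [0.0, 1.0]

excited_to_thermal = thermo_majorizes(excited, thermal, energies, beta)
thermal_to_excited = thermo_majorizes(thermal, excited, energies, beta)
```

Only the elbows of the target curve are inspected, which is the point of the theorem: concavity of the initial curve takes care of every other value of $x$.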
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can also be achieved when attempting to convert p into 
$\sigma_D$:

$$\rho \xrightarrow{\mathrm{TO}} \rho' = p\sigma + (1-p)X \xrightarrow{\text{decohere}} \rho'_D = p\sigma_D + (1-p)X_D.$$

Thus, to upper bound $p^*$, it suffices to show that Eq. (45) holds for block-diagonal $\sigma$. Furthermore, w.l.o.g. we can assume that $\rho'$ and $X$ are also block-diagonal.
Using Weyl’s inequality as per Theorem [T] to deal with 
degenerate energy levels, for block-diagonal $\rho'$, $\sigma$ and $X$,


FIG. 5. Here we illustrate the construction of the state $\rho_\sigma$ used in the proof of Theorem 4. The points of the curve of $\rho$ that are at the same horizontal position as the elbows of $\sigma$ are joined, and by concavity the resultant curve is always below that of $\rho$.


$$\eta'_i \ge p\,\zeta_i, \quad \forall i. \tag{47}$$

Now consider the sub-normalized thermo-majorization curve of $p\sigma$ given by the points:

$$\left\{ \sum_{i=1}^{k} e^{-\beta E_i^{(\sigma)}},\ p\sum_{i=1}^{k} \zeta_i^{(\sigma)} \right\}_{k=1}^{n}, \tag{48}$$

and the (possibly non-concave) curve formed by plotting the eigenvalues of $\rho'$ according to the $\beta$-ordering of $\sigma$. This is given by the points:

$$\left\{ \sum_{i=1}^{k} e^{-\beta E_i^{(\sigma)}},\ \sum_{i=1}^{k} \eta'^{(\sigma)}_i \right\}_{k=1}^{n}. \tag{49}$$

By Eq. (47), the curve defined in Eq. (49) is never below that defined in Eq. (48).

Finally, the thermo-majorization curve of $\rho'$ is given by:

$$\left\{ \sum_{i=1}^{k} e^{-\beta E_i^{(\rho')}},\ \sum_{i=1}^{k} \eta'^{(\rho')}_i \right\}_{k=1}^{n}. \tag{50}$$


A. Non-deterministic transformations

Having defined the appropriate monotones for Thermal Operations, we are now in a position to investigate non-deterministic transformations and prove a theorem analogous to Theorem 1.

Theorem 5. Suppose we wish to transform the state $\rho$ to the state $\sigma$ under Thermal Operations. The maximum value of $p$, $p^*$, that can be achieved in the transition:

$$\rho \xrightarrow{\mathrm{TO}} \rho' = p\sigma + (1-p)X, \tag{44}$$

is such that:

$$p^* \le \min_{x \in \mathcal{L}(\sigma)} \frac{V_x(\rho)}{V_x(\sigma)}. \tag{45}$$

Furthermore, if $\sigma$ is block-diagonal in the energy eigenbasis, there exists a protocol that achieves the bound.
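For block-diagonal states the bound of Theorem 5 is a finite minimum and can be evaluated directly. The sketch below assumes states are given as population lists over a shared list of energies; the function names and the final clamp to 1 (which only guards against floating-point rounding at $x = Z$) are choices made for this example rather than part of the theorem.

```python
import math

def thermo_curve(probs, energies, beta):
    """Elbow points of the thermo-majorization curve, starting from (0, 0)."""
    order = sorted(range(len(probs)),
                   key=lambda i: probs[i] * math.exp(beta * energies[i]),
                   reverse=True)  # beta-ordering
    points, x, y = [(0.0, 0.0)], 0.0, 0.0
    for i in order:
        x += math.exp(-beta * energies[i])
        y += probs[i]
        points.append((x, y))
    return points

def height(points, x):
    """V_x: height of the piecewise-linear curve at horizontal position x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def max_transition_probability(rho, sigma, energies, beta):
    """p*: minimum over the elbows of sigma of V_x(rho) / V_x(sigma)."""
    curve_rho = thermo_curve(rho, energies, beta)
    curve_sigma = thermo_curve(sigma, energies, beta)
    ratio = min(height(curve_rho, x) / y for x, y in curve_sigma[1:])
    return min(1.0, ratio)  # clamp guards against rounding only

# Example: a qubit with energies 0 and 1 at beta = 1.
beta, energies = 1.0, [0.0, 1.0]
Z = sum(math.exp(-beta * e) for e in energies)
thermal = [math.exp(-beta * e) / Z for e in energies]
excited = [0.0, 1.0]

p_up = max_transition_probability(thermal, excited, energies, beta)
p_down = max_transition_probability(excited, thermal, energies, beta)
```

In this example the thermal state fluctuates to the excited state with probability $e^{-\beta}/Z$, its thermal population, whereas the reverse transition is deterministic.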


Proof. Proving this result is more complicated than proving Theorem 1 due to the fact that $\rho$ and $\sigma$ may have different $\beta$-orderings. We proceed as before, first showing the bound in Eq. (45) and then giving a protocol that achieves the bound when $\sigma$ is block-diagonal.

We prove the bound in Eq. (45) by constructing useful intermediate curves between those of $\rho$ and $p\sigma$ to deal with differing $\beta$-orders. With these in place, the result will follow in a similar manner to Theorem 1.

We begin by showing that given Eq. (44):

$$V_x(\rho) \ge p\,V_x(\sigma), \quad \forall x \in [0, Z]. \tag{46}$$

First consider (for general $\sigma$) the maximum value of $p$ that can be achieved in attempting to convert $\rho$ into $\sigma$. As decohering is a Thermal Operation, this value of $p$


Note that attempting to construct a thermo-majorization curve for $\rho'$ with respect to the $\beta$-ordering of another state, as we do in Eq. (49), has the effect of rearranging the piecewise linear segments of the true thermo-majorization curve. This means that they may no longer be joined from left to right in order of decreasing gradient. Such a curve will always be below the true thermo-majorization curve. To see this, imagine constructing a curve from the piecewise linear elements and, in particular, trying to construct a curve that would lie above all other possible constructions. Starting at the origin, we are forced to choose the element with the steepest gradient; all other choices would lie below this by virtue of having a shallower gradient. We then proceed iteratively, starting from the endpoint of the previous section added and choosing the element with the largest gradient from the remaining linear segments. The construction that we obtain is the true thermo-majorization curve. A graphical description of this proof is shown in Figure 6.
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FIG. 6. Here we show graphically the steps of the proof of the 
first part of Theorem 5. In the decomposition of Eq. (44) the curve of $p\sigma$ must always be below that of $\rho'$ and hence also $\rho$. This sets the maximum probability $p^*$ as defined in Eq. (45). Both $p\sigma$ and the disordered $\rho'$ have the same $\beta$-ordering.


As such, the curve in Eq. (50) is never below that in Eq. (49). This gives us:

$$V_x(\rho) \ge V_x(\rho') \ge p\,V_x(\sigma), \tag{51}$$

where the first inequality holds as, by definition, $\rho$ thermo-majorizes $\rho'$. In particular we have:

$$p^* \le \min_{x \in \mathcal{L}(\sigma)} \frac{V_x(\rho)}{V_x(\sigma)}. \tag{52}$$

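When $\sigma$ is block-diagonal in the energy eigenbasis, a protocol that saturates the bound is:

$$\rho \xrightarrow{\mathrm{TO}} \rho_\sigma \xrightarrow{\mathrm{TO}} \rho' = p^*\sigma + (1-p^*)X,$$

where $\rho_\sigma$ was defined in Eq. (43) and is thermo-majorized by $\rho$. As $\rho_\sigma$ and $\sigma$ have the same $\beta$-ordering and:

$$\frac{V_x(\rho)}{V_x(\sigma)} = \frac{V_x(\rho_\sigma)}{V_x(\sigma)}, \quad \forall x \in \mathcal{L}(\sigma), \tag{53}$$

applying the same construction used in Theorem 1 gives a strategy to produce $\rho'$ that achieves:

$$p^* = \min_{x \in \mathcal{L}(\sigma)} \frac{V_x(\rho)}{V_x(\sigma)}. \tag{54}$$

□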


B. Measuring whether the transition occurred 
under Thermal Operations 

For block-diagonal $\sigma$, after obtaining $\rho'$ through Thermal Operations we may apply the measurement defined by Eq. (22) to extract our target state with probability $p^*$. This can be done through a process that uses an ancilla qubit system, $Q$, that starts and ends in the state $|0\rangle$ and has associated Hamiltonian $H_Q = \mathbb{I}_2$, a unitary that correlates the system with the ancilla, and a projective measurement on the ancilla qubit. As the measurement operators are diagonal in the energy eigenbasis, we will find that the unitary is energy conserving and within the set of Thermal Operations. Furthermore, the ancilla that is used to perform the POVM can be returned to its original state. Hence the only cost
we have to pay is to erase the record of the measurement 
outcome itself. As is well known m , the cost of erasing 
the record is $kT\log 2$, although if one is repeating the process many times, then it is $kT\,h(p^*)$ with $h(p^*)$ the binary entropy $h(p^*) = -p^*\log p^* - (1-p^*)\log(1-p^*)$

ED- 

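The unitary that we shall use is given by:

$$U_{SQ} = \begin{pmatrix} \sqrt{M} & \sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M} & -\sqrt{M} \end{pmatrix}, \tag{55}$$

where $M$ is defined as per Eq. (22). Note that $U_{SQ} = U_{SQ}^\dagger$. Its effect on the initial joint state is:

$$\begin{aligned}
U_{SQ}\left(\rho' \otimes |0\rangle\langle 0|\right)U_{SQ}^\dagger
&= \begin{pmatrix} \sqrt{M} & \sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M} & -\sqrt{M} \end{pmatrix}
\begin{pmatrix} \rho' & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} \sqrt{M} & \sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M} & -\sqrt{M} \end{pmatrix} \\
&= \begin{pmatrix} \sqrt{M}\rho'\sqrt{M} & \sqrt{M}\rho'\sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M}\rho'\sqrt{M} & \sqrt{\mathbb{I}-M}\rho'\sqrt{\mathbb{I}-M} \end{pmatrix} \\
&= \begin{pmatrix} p^*\sigma & \sqrt{M}\rho'\sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M}\rho'\sqrt{M} & (1-p^*)X \end{pmatrix}.
\end{aligned}$$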


If we now measure the ancilla in the computational basis, the joint state will collapse to $\sigma \otimes |0\rangle\langle 0|$ when the 0 outcome is observed. This happens with probability $p^*$. If the 1 outcome is observed, the joint state collapses to $X \otimes |1\rangle\langle 1|$ and this happens with probability $1 - p^*$. In addition, if the 1 outcome is observed, we can then apply a Pauli $X$ (bit flip) to the ancilla qubit to return it to its initial state.

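To see that $U_{SQ}$ commutes with the total Hamiltonian and belongs to the class of Thermal Operations, first note that the total Hamiltonian is given by:

$$H_{SQ} = H_S \otimes \mathbb{I}_2 + \mathbb{I}_n \otimes \mathbb{I}_2. \tag{56}$$

The unitary trivially commutes with the second term, so focusing on the first term, and noting that $M$ and $H_S$ are both diagonal matrices and so commute, it is easy to check that:

$$\begin{aligned}
\left[U_{SQ}, H_S \otimes \mathbb{I}_2\right]
&= \begin{pmatrix} \sqrt{M} & \sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M} & -\sqrt{M} \end{pmatrix}
\begin{pmatrix} H_S & 0 \\ 0 & H_S \end{pmatrix}
- \begin{pmatrix} H_S & 0 \\ 0 & H_S \end{pmatrix}
\begin{pmatrix} \sqrt{M} & \sqrt{\mathbb{I}-M} \\ \sqrt{\mathbb{I}-M} & -\sqrt{M} \end{pmatrix} \\
&= 0.
\end{aligned}$$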
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Hence [Usq, Hsq] = 0. 

Observe that this reasoning can be generalized to measurements with $s$ outcomes [56]. Provided the measurement operators commute with $H_S$, the measurement can be performed using an $s$-level ancilla system with trivial Hamiltonian and a joint energy-conserving unitary. Such a measurement can be performed for free, up to having to spend work to erase the record of the measurement outcome at a cost of $kT\ln s$. On the other hand, channels that are not composed of Thermal Operations (including some measurements characterized by non-diagonal operators) can be seen as a resource [57].
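The two properties used above, that $U_{SQ}$ is unitary and that it commutes with $H_S \otimes \mathbb{I}_2$, can be checked numerically for a diagonal $M$. The sketch below is written in plain Python; the entries of `m_diag` and `energies` are arbitrary illustrative values (the actual $M$ of Eq. (22) is not reproduced here), and the helper names are hypothetical.

```python
import math

def build_unitary(m_diag):
    """U_SQ of Eq. (55) for a diagonal M, as a 2n x 2n real matrix in block form."""
    n = len(m_diag)
    u = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, m in enumerate(m_diag):
        a, b = math.sqrt(m), math.sqrt(1.0 - m)
        u[i][i], u[i][n + i] = a, b            # top row of blocks: sqrt(M), sqrt(1 - M)
        u[n + i][i], u[n + i][n + i] = b, -a   # bottom row: sqrt(1 - M), -sqrt(M)
    return u

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def max_abs_diff(a, b):
    """Largest entry-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

m_diag = [0.3, 0.9, 0.0]    # assumed diagonal entries of M, each in [0, 1]
energies = [0.0, 1.0, 2.0]  # assumed spectrum of H_S
n = len(m_diag)

U = build_unitary(m_diag)
identity = [[1.0 if i == j else 0.0 for j in range(2 * n)] for i in range(2 * n)]
# H_S (x) I_2 in the same block ordering is diag(H_S, H_S).
H = [[energies[i % n] if i == j else 0.0 for j in range(2 * n)] for i in range(2 * n)]

# U is real and symmetric, so U^dagger = U and unitarity reads U U = I.
unitarity_error = max_abs_diff(matmul(U, U), identity)
commutator_norm = max_abs_diff(matmul(U, H), matmul(H, U))
```

Both quantities vanish up to rounding because $U_{SQ}$ only couples pairs of joint levels that share the same system energy.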

C. Work of transition under Thermal Operations 

1. Work systems 

In general, if we want a transition p —> a to be pos¬ 
sible, work may have to be supplied. Alternatively, if a 
transition can be achieved with certainty, it may be pos¬ 
sible to extract work. For the thermodynamics of small 
systems, the concept of deterministic work (also referred 
to in the literature as single-shot or worst-case work) has 
been introduced mmm- 

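Within the Thermal Operation paradigm, the optimal amount of work that must be added or gained can be quantified using the energy gap, $W$, of a 2-level system with ground state $|0\rangle$ and excited state $|W\rangle$ with energy $W$. The associated Hamiltonian is:

$$H = W|W\rangle\langle W|. \tag{57}$$

The work of transition, $W_{\rho\to\sigma}$, is such that:

$$\begin{aligned}
&\text{if } W_{\rho\to\sigma} \le 0: \quad \rho \otimes |W_{\rho\to\sigma}\rangle\langle W_{\rho\to\sigma}| \xrightarrow{\mathrm{TO}} \sigma \otimes |0\rangle\langle 0|, \\
&\text{if } W_{\rho\to\sigma} > 0: \quad \rho \otimes |0\rangle\langle 0| \xrightarrow{\mathrm{TO}} \sigma \otimes |W_{\rho\to\sigma}\rangle\langle W_{\rho\to\sigma}|.
\end{aligned} \tag{58}$$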

Defining work in such a way enables the quantification of the worst-case work of a process. When $W_{\rho\to\sigma}$ is negative, it can be interpreted as the smallest amount of work that must be supplied to guarantee the transition takes place. If it is positive, it is the largest amount of work we are guaranteed to extract in the process. As the work system is both initially and finally in a pure state, no entropy is contained within it and its energy change must be completely due to work being exchanged with the system. Given the energy-conservation law that Thermal Operations follow (equivalent to the first law), this idea of work automatically yields a definition of what heat is. In a given operation, the total change in energy of work bit, system and heat bath must be zero, and hence we can straightforwardly identify heat as the change in energy of the heat bath, or minus the change in energy of the system and work bit.



FIG. 7. We show the thermo-majorization curves of a state to which a work qubit in one of two pure states has been tensored. Adding this work system takes $Z \to Z\left(1 + e^{-\beta W}\right)$, extending the x-axis. When we tensor with the ground state to form $\rho \otimes |0\rangle\langle 0|$, the curve is the same as for $\rho$ alone, but when the excited state is tensored, there is a change in the energy levels of the $\beta$-ordering, and as a result the curve of $\rho$ is compressed by a ratio of $e^{-\beta W}$.

As we illustrate in Figure 7, the effect of appending a pure state of work to $\rho$ is equivalent to stretching the thermo-majorization curve by a factor of $e^{-\beta W}$, and tensoring by the corresponding ground state to $\sigma$ does not change the curve [20]. In both cases the $\beta$-order is preserved, and the new curves will have a lengthened x-axis $\left[0, Z\left(1 + e^{-\beta W}\right)\right]$. These different stretchings can serve to place the curve of $\rho$ just above that of $\sigma$, in which case $W$ will be the work of transition, in a similar way to the case of nonuniformity within Noisy Operations.


2. Monotones under Thermal Operations, and the work of 
transition 

In Thermal Operations, the horizontal distance between a state's thermo-majorization curve and the y-axis is again a monotone for each value of $y \in [0, 1]$. We denote these by $L_y$ and, as before, they never decrease under Thermal Operations. In particular, for block-diagonal $\rho$, we have:

$$\begin{aligned}
L_{y_k}(\rho) &= \sum_{i=1}^{k} e^{-\beta E_i^{(\rho)}}, \quad \text{for } y_k = \sum_{i=1}^{k} \eta_i^{(\rho)}, \quad 1 \le k < \mathrm{rank}(\rho), \\
L_1(\rho) &= \sum_{i=1}^{\mathrm{rank}(\rho)} e^{-\beta E_i^{(\rho)}},
\end{aligned} \tag{59}$$

where all sums have been properly $\beta$-ordered.

Similarly to Lemma [2] we have: 

Lemma 6. Given two states p and a, where a is block- 
diagonal in the energy eigenbasis, under Thermal Oper- 
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ations: 


$$e^{-\beta W_{\rho\to\sigma}} = \max_{y \in \mathcal{D}(\sigma)} \frac{L_y(\rho)}{L_y(\sigma)}. \tag{60}$$


Now, if we append an idealized weight with Hamiltonian $H_W = \int_{\mathbb{R}} dw\, w|w\rangle\langle w|$ as a work storage system initially in the state $|0\rangle$, by definition there exists a different set of thermal operations that extracts work from $\sigma$

The proof is near identical to that given in Lemma [2] 
for Noisy Operations and so we omit it here. 
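As a numerical illustration of Lemma 6, the sketch below evaluates the right-hand side of Eq. (60) for block-diagonal states. It assumes that $\mathcal{D}(\sigma)$ is the set of heights of the elbows of the curve of $\sigma$ (the $y_k$ of Eq. (59) together with $y = 1$); the helper names and the list-based representation of states are likewise assumptions made for this example.

```python
import math

def thermo_curve(probs, energies, beta):
    """Elbow points of the thermo-majorization curve, starting from (0, 0)."""
    order = sorted(range(len(probs)),
                   key=lambda i: probs[i] * math.exp(beta * energies[i]),
                   reverse=True)  # beta-ordering
    points, x, y = [(0.0, 0.0)], 0.0, 0.0
    for i in order:
        x += math.exp(-beta * energies[i])
        y += probs[i]
        points.append((x, y))
    return points

def width(points, y, tol=1e-12):
    """L_y: horizontal distance from the y-axis to the curve at height y."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y <= y1 + tol:
            return min(x1, x0 + (x1 - x0) * (y - y0) / (y1 - y0))
    return points[-1][0]

def work_of_transition(rho, sigma, energies, beta):
    """W with exp(-beta W) = max over elbow heights y of sigma of L_y(rho) / L_y(sigma)."""
    curve_rho = thermo_curve(rho, energies, beta)
    curve_sigma = thermo_curve(sigma, energies, beta)
    heights = {y for _, y in curve_sigma[1:] if y > 0}
    ratio = max(width(curve_rho, y) / width(curve_sigma, y) for y in heights)
    return -math.log(ratio) / beta

# Example: a qubit with trivial Hamiltonian, in units where kT = 1.
beta, energies = 1.0, [0.0, 0.0]
pure, mixed = [1.0, 0.0], [0.5, 0.5]

w_extract = work_of_transition(pure, mixed, energies, beta)
w_erase = work_of_transition(mixed, pure, energies, beta)
```

For this trivial-Hamiltonian example the sketch returns $+kT\ln 2$ for the pure-to-mixed transition and $-kT\ln 2$ for the reverse, matching the familiar work value of one bit of information.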

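If $\sigma$ is not block-diagonal, the right hand side of Eq. (60) lower bounds $e^{-\beta W_{\rho\to\sigma}}$. To see this, recall that decohering commutes with Thermal Operations, and hence if the transition $\rho \otimes |0\rangle\langle 0| \to \sigma \otimes |W_{\rho\to\sigma}\rangle\langle W_{\rho\to\sigma}|$ is possible, so is $\rho \otimes |0\rangle\langle 0| \to \sigma_D \otimes |W_{\rho\to\sigma}\rangle\langle W_{\rho\to\sigma}|$, and hence $W_{\rho\to\sigma} \le W_{\rho\to\sigma_D}$.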


3. Bounds on the transition probability 


$$\sigma \otimes |0\rangle\langle 0| \xrightarrow{\mathrm{TO}} \tau \otimes |W_{\sigma\to\tau}\rangle\langle W_{\sigma\to\tau}|. \tag{64}$$

By linearity, applying this set of Thermal Operations to $\tau = p^*\sigma + (1-p^*)X$ yields:

$$p^*\,\tau \otimes |W_{\sigma\to\tau}\rangle\langle W_{\sigma\to\tau}| + (1-p^*)X'_{SW}, \tag{65}$$

where $X'_{SW}$ is some joint system-weight state, with the weight in some work distribution $p_X(w)$. Note that this operation is applied on both system and weight, and does not need to conserve the thermal state of the system alone. The Jarzynski equality for this operation reads:


$$p^* e^{\beta W_{\sigma\to\tau}} + (1-p^*)\sum_{w} p_X(w)\,e^{\beta w} = 1. \tag{66}$$

We can prove a result analogous to Eq. (33) for the thermal case:


Lemma 7. Given two states $\rho$ and $\sigma$, where $\sigma$ is block-diagonal in the energy eigenbasis, under Thermal Operations:

$$e^{\beta W_{\rho\to\sigma}} \le p^* \le e^{-\beta W_{\sigma\to\rho}}, \tag{61}$$

where, as $p^* \le 1$, we assume $W_{\rho\to\sigma} \le 0$. If $W_{\rho\to\sigma} > 0$, $p^* = 1$ and the transformation from $\rho$ to $\sigma$ can be done deterministically, potentially extracting a finite amount of work.

Proof. The previous Lemma 3 can be seen as a general statement about pairs of concave Lorenz-like curves: the minimum vertical ratio is lower and upper bounded by the minimum and maximum horizontal ratios of the two. Given our previous definitions of the work of transition, and the fact that $p^*$ is the minimum vertical ratio of the two Lorenz curves (as shown in Theorem 5), the result follows.

□
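The two-sided bound of Lemma 7 can be checked numerically on a small example. The sketch below combines the vertical ratio of Theorem 5 with the horizontal ratio of Lemma 6; the helper names, the list-based representation of block-diagonal states, and the particular qubit populations are assumptions chosen purely for illustration.

```python
import math

def thermo_curve(probs, energies, beta):
    """Elbow points of the thermo-majorization curve, starting from (0, 0)."""
    order = sorted(range(len(probs)),
                   key=lambda i: probs[i] * math.exp(beta * energies[i]),
                   reverse=True)  # beta-ordering
    points, x, y = [(0.0, 0.0)], 0.0, 0.0
    for i in order:
        x += math.exp(-beta * energies[i])
        y += probs[i]
        points.append((x, y))
    return points

def height(points, x):
    """V_x: height of the curve at horizontal position x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def width(points, y, tol=1e-12):
    """L_y: horizontal distance from the y-axis to the curve at height y."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y <= y1 + tol:
            return min(x1, x0 + (x1 - x0) * (y - y0) / (y1 - y0))
    return points[-1][0]

def p_star(rho, sigma, energies, beta):
    """Minimum vertical ratio over the elbows of sigma (Theorem 5)."""
    c_rho, c_sigma = thermo_curve(rho, energies, beta), thermo_curve(sigma, energies, beta)
    return min(1.0, min(height(c_rho, x) / y for x, y in c_sigma[1:]))

def work(rho, sigma, energies, beta):
    """Work of transition from the maximum horizontal ratio (Lemma 6)."""
    c_rho, c_sigma = thermo_curve(rho, energies, beta), thermo_curve(sigma, energies, beta)
    heights = {y for _, y in c_sigma[1:] if y > 0}
    ratio = max(width(c_rho, y) / width(c_sigma, y) for y in heights)
    return -math.log(ratio) / beta

# Example: a qubit with energies 0 and 1 at beta = 1 (illustrative populations).
beta, energies = 1.0, [0.0, 1.0]
rho, sigma = [0.7, 0.3], [0.2, 0.8]

lower = math.exp(beta * work(rho, sigma, energies, beta))
p = p_star(rho, sigma, energies, beta)
upper = math.exp(-beta * work(sigma, rho, energies, beta))
```

Here work must be supplied to go from `rho` to `sigma`, so the lower bound is strictly below 1, and `p` falls between the two exponentials as the lemma requires.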

The upper bound of Lemma 7 can be related to the Jarzynski equality, which is found to hold for general thermal operations applied to the system in an initial thermal state (see [30] for further details). The equality states that, for a given thermal operation that extracts work $w$ with some probability $p(w)$, we have that:

$$\left\langle e^{\beta w} \right\rangle = \sum_{w} p(w)\,e^{\beta w} = 1. \tag{62}$$

The Jarzynski equation is valid if the initial state is thermal, so let us take the special case of Lemma 7 of a process where we start with a thermal state $\tau$ and probabilistically go to some $\sigma$ diagonal in energy, with optimal probability $p^*$. Because $\tau$ is the fixed point, the effect of that operation is trivial:

$$\tau \xrightarrow{\mathrm{TO}} \rho' = \tau = p^*\sigma + (1-p^*)X. \tag{63}$$


The second term in this sum is positive, and hence we have:

$$p^* e^{\beta W_{\sigma\to\tau}} \le 1, \tag{67}$$

which is the upper bound of Lemma 7.

Note that in situations where the upper bound is saturated (such as reversible processes with $W_{\sigma\to\tau} = -W_{\tau\to\sigma}$, when the thermo-majorization curve of $\sigma$ is also a straight line) the operation in Eq. (64) costs a divergent amount of work in the case of failure, i.e. from the state $X$ in Eq. (63).


D. Changing Hamiltonian 

Our results so far have assumed that $\rho$ and $\sigma$ are associated with the same Hamiltonian. Suppose the initial system has Hamiltonian $H_1$ and the final system Hamiltonian $H_2$. Following [20], this scenario can be mapped to one with identical initial and final Hamiltonian, $H$, if we instead consider the transition between $\rho \otimes |0\rangle\langle 0|$ and $\sigma \otimes |1\rangle\langle 1|$ where:

$$H = H_1 \otimes |0\rangle\langle 0| + H_2 \otimes |1\rangle\langle 1|. \tag{68}$$

Note that the partition function associated with $H$ is $Z = Z_1 + Z_2$.

The height of the thermo-majorization curve of $\rho \otimes |0\rangle\langle 0|$ with respect to $H$ is identical to that of $\rho$ with respect to $H_1$ on $[0, Z_1]$ and equal to 1 on $[Z_1, Z]$. Similarly, the height of the thermo-majorization curve of $\sigma \otimes |1\rangle\langle 1|$ is identical to that of $\sigma$ on $[0, Z_2]$ and equal to 1 on $[Z_2, Z]$. Hence by extending the definition of $V_x(\rho)$ so that $V_x(\rho) = 1$ for $x > Z_1$, we can readily apply Theorems 4 and 5 to the case of changing Hamiltonians.
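The embedding into a single Hamiltonian can be mimicked numerically by concatenating the two spectra and padding the populations with zeros. The sketch below does this and then evaluates the bound of Theorem 5; the spectra, the choice of thermal states for the example, and all function names are assumptions made for illustration.

```python
import math

def thermo_curve(probs, energies, beta):
    """Elbow points of the thermo-majorization curve, starting from (0, 0)."""
    order = sorted(range(len(probs)),
                   key=lambda i: probs[i] * math.exp(beta * energies[i]),
                   reverse=True)  # beta-ordering
    points, x, y = [(0.0, 0.0)], 0.0, 0.0
    for i in order:
        x += math.exp(-beta * energies[i])
        y += probs[i]
        points.append((x, y))
    return points

def height(points, x):
    """V_x: height of the curve at horizontal position x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def p_star_changing_hamiltonian(rho, e1, sigma, e2, beta):
    """p* for rho (Hamiltonian H_1) -> sigma (Hamiltonian H_2) via the joint H of Eq. (68)."""
    joint_energies = list(e1) + list(e2)
    rho_joint = list(rho) + [0.0] * len(e2)      # rho (x) |0><0|
    sigma_joint = [0.0] * len(e1) + list(sigma)  # sigma (x) |1><1|
    c_rho = thermo_curve(rho_joint, joint_energies, beta)
    c_sigma = thermo_curve(sigma_joint, joint_energies, beta)
    return min(1.0, min(height(c_rho, x) / y for x, y in c_sigma[1:]))

# Example: thermal state of H_1 to thermal state of H_2 (assumed spectra), beta = 1.
beta, e1, e2 = 1.0, [0.0, 1.0], [0.0, 2.0]
Z1 = sum(math.exp(-beta * e) for e in e1)
Z2 = sum(math.exp(-beta * e) for e in e2)
tau1 = [math.exp(-beta * e) / Z1 for e in e1]
tau2 = [math.exp(-beta * e) / Z2 for e in e2]

p_raise = p_star_changing_hamiltonian(tau1, e1, tau2, e2, beta)
p_lower = p_star_changing_hamiltonian(tau2, e2, tau1, e1, beta)
```

In this example raising the gap lowers the partition function, and the sketch returns $p^* = Z_2/Z_1$, while the reverse change of Hamiltonian succeeds with certainty.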

Note that as $L_y(\rho) = L_y(\rho \otimes |0\rangle\langle 0|)$ for $0 \le y \le 1$ (and similarly for $\sigma$), changing Hamiltonians does not affect the results of Section III C.


p' = t = p*a + (1 — P*)X. 
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IV. CONCLUSION 


Here, we have introduced a finite set of functions which, like the free energy, can only go down in the resource theory of Thermal Operations. We used these to compute the work of transition, and the maximum probability of making a transition between two states. Finally, we saw that the work of transition between the two states, and vice-versa, can be used to bound the maximum probability of making the transition.

In maximizing the value of $p$ in Eq. (2) to obtain $p^*$, we have attempted to maximize the fraction of $\sigma$ present in a state obtainable from $\rho$. With access to a single two outcome measurement, $\sigma$ can also be obtained from $\rho$ with probability at least $p^*$. There are other measures that one could quantify in attempting to obtain a state that behaves like $\sigma$. For example, one could consider the fidelity between $\sigma$ and a state reachable from $\rho$:

$$F_{\mathrm{TO}}(\rho, \sigma) = \max\left\{ F(\rho', \sigma) : \rho \xrightarrow{\mathrm{TO}} \rho' \right\}, \tag{69}$$

where $F(\rho, \sigma) = \mathrm{tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}$ is the fidelity between the two states. Investigating this problem is an open question, but note that for diagonal $\sigma$ we have $F_{\mathrm{TO}}(\rho, \sigma) \ge F(\rho', \sigma) \ge \sqrt{p^*}$.

Another alternative would be to consider heralded probabilistic transformations. Here a 2-level flag system with trivial Hamiltonian and starting in the state $|0\rangle$ is provided with the initial state $\rho$. The goal is to transform both system and flag so that a measurement on the final flag state would reveal that the system is in state $\sigma$ with probability $p$ and some other state with probability $1 - p$. More concretely, one would be interested in maximizing the value of $p$ in the transformation:

$$\rho \otimes |0\rangle\langle 0| \xrightarrow{\mathrm{TO}} \hat{\rho} = p\,\sigma \otimes |0\rangle\langle 0| + (1-p)X \otimes |1\rangle\langle 1|. \tag{70}$$

Due to the results in Section III B, it is clear that the maximum value of $p$ achievable in the heralded case, Eq. (70), is at least as large as $p^*$ in the unheralded case, Eq. (2), for block-diagonal $\sigma$. In follow-up work to the initial version of this manuscript [58] the converse was proven and thus the two maximum probabilities are equal. In Appendix B we extend this analysis to consider the achievable heralded probability when $\sigma$ contains coherences or when one may use a catalyst to assist in the transformation.

At the moment, although our results regarding maxi¬ 
mum extractable work are general, little is known about 
transitions when the final state is not block-diagonal in 
the energy eigenbasis. In such a situation, our results 
provide necessary conditions but are not sufficient. Find¬ 
ing sufficient conditions is expected to be difficult, as we 
do not know such conditions even for non-probabilistic 
transformations. For recent results on the role of co¬ 
herences in quantum thermodynamics, see for example 
[26H281132| . Nonetheless, we are able to utilize some of 
these results to provide bounds on the achievable her¬ 


alded probability when the target state contains coher¬ 
ences in energy. This is done in Appendix [B] . 

Our analysis has focused on Noisy and Thermal Operations in the absence of a catalyst, i.e. an ancilla which is used to aid in a transition but returned in the same state. In Catalytic Thermal Operations, CTO, given $\rho$ and $\sigma$, we are interested in whether there exists a state $\omega$ such that:

$$\rho \otimes \omega \xrightarrow{\mathrm{TO}} \sigma \otimes \omega. \tag{71}$$

If such an $\omega$ exists, we say $\rho \xrightarrow{\mathrm{CTO}} \sigma$. There exist instances where $\rho \not\xrightarrow{\mathrm{TO}} \sigma$ and yet $\rho \xrightarrow{\mathrm{CTO}} \sigma$. Investigating when such catalytic transitions exist has led to a family of second laws of thermodynamics that apply in the single-shot regime [25]. Having access to catalysts has the potential to achieve higher values of $p$ than that defined by $p^*$ and it would be interesting to find an expression or bound for the maximum value of $p$ in the process:

$$\rho \xrightarrow{\mathrm{CTO}} \rho' = p\sigma + (1-p)X. \tag{72}$$


Note that a bound can be obtained from any non-increasing monotone of CTO, $M$ say, that satisfies $M(p\sigma + (1-p)X) \ge pM(\sigma)$. Bounding the maximum transition probability under Catalytic Thermal Operations is made more difficult by the fact that the generalized free energies found in [25] are not concave. However, for the case of heralded probability, the situation is somewhat easier and in Appendix B we completely characterize what is achievable under CTO when the target state is block-diagonal in energy.

Another avenue of research is to generalize our result to the case where one is interested in not only maximizing the probability of obtaining a single state, but rather in finding the probability simplex of going to an ensemble of many states. Again, the fact that the monotones used in thermodynamics are not in general concave means that the techniques used in entanglement theory [59] cannot be applied directly.

Finally, by supplying more work or demanding that extra work is extracted, the value of $p^*$ achieved can be raised or lowered. For $W < 0$, one could calculate $p^*$ (as a function of $W$) for the states $\rho \otimes |W\rangle\langle W|$ and $\sigma \otimes |0\rangle\langle 0|$. For $W > 0$ the states to consider would be $\rho \otimes |0\rangle\langle 0|$ and $\sigma \otimes |W\rangle\langle W|$. What is the tradeoff between $p^*$ and $W$? As an example, the solution for qubit systems in the Noisy Operations framework is given in Appendix C.

This work has focused on the probability with which 
a given state can fluctuate into another under a ther¬ 
modynamical process. The term fluctuation is usually 
applied within thermodynamics to the concept of fluc¬ 
tuating work, a notion most famously captured by the 
Jarzynski equality and Crooks’ theorem. These were de¬ 
rived under the framework of stochastic thermodynam¬ 
ics while our research was based on applying ideas from 
quantum information theory. Finding common ground 
between the two paradigms is likely to be beneficial to 
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both fields and links between work-based fluctuation the¬ 
orems and the resource theory operation have been de¬ 
veloped in mm- In work related to this paper [36] , we 
shall strengthen these connections still further, formu¬ 
lating the idea of fluctuating work within the resource 
theory approach and providing new insight into the asso¬ 
ciated fluctuation theorems. What is more, we shall find 
fully quantum generalizations and see how the 2nd law 
of thermodynamics can be recast as an equality. 
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Appendix A: Entanglement cost of transformations under LOCC 

The monotones that we have used for studying Noisy Operations have been, or can be, defined solely in terms of Lorenz curves. They are also monotones in the resource theory of bipartite pure state entanglement manipulation under Local Operations and Classical Communication [62, 63], where such curves can also be constructed. Using our monotones, and the behavior of Lorenz curves under tensor product with certain states, we give an expression for the single-shot entanglement of transition. This is the amount of entanglement that must be added (or can be extracted) in transforming $|\Psi_{AB}\rangle$ into $|\Phi_{AB}\rangle$ under LOCC.

Previous work has considered the distillable entanglement and entanglement cost: the entanglement of transition when one of $|\Phi_{AB}\rangle$ or $|\Psi_{AB}\rangle$, respectively, is taken to be a separable state. In [49], the amount of entanglement that can be distilled from a single copy of a bipartite mixed state, $\sigma_{AB}$, was bounded in terms of the coherent information. For a bipartite pure state, $|\Psi_{AB}\rangle$, it is given precisely by the min-entropy of the reduced state $\mathrm{tr}_B |\Psi_{AB}\rangle\langle\Psi_{AB}|$ [50].
The amount of entanglement required to create a single copy of ctab was calculated in m in terms of the conditional 
zero-Rényi entropy. In each paper, the analysis extends to accomplishing the task up to fixed error, $\epsilon$. Here we go beyond the distillation and cost, showing that the more general entanglement of transition between two arbitrary pure bipartite states can be quantified in terms of the monotones $L_y$.

For a bipartite pure state, $|\Psi\rangle$, on a system $AB$, let:

$$\rho_{|\Psi\rangle} = \mathrm{tr}_B |\Psi\rangle\langle\Psi|. \tag{A1}$$

Without access to any additional resources, it is possible for two separated parties to transform $|\Psi\rangle$ into another bipartite state, $|\Phi\rangle$, under LOCC if and only if $\rho_{|\Phi\rangle}$ majorizes $\rho_{|\Psi\rangle}$ [53]. Hence if $|\Psi\rangle$ can be transformed into $|\Phi\rangle$:

$$V_l\left(\rho_{|\Phi\rangle}\right) \ge V_l\left(\rho_{|\Psi\rangle}\right), \quad \forall l, \tag{A2}$$

and:

$$L_y\left(\rho_{|\Phi\rangle}\right) \le L_y\left(\rho_{|\Psi\rangle}\right), \quad \forall y \in \mathcal{D}\left(\rho_{|\Psi\rangle}\right), \tag{A3}$$

where the functions $V_l$, $L_y$ and the set $\mathcal{D}$ are defined as per Section II. Note that for LOCC we consider the 'elbows' of the Lorenz curve associated with the initial state, whilst for NO we consider the 'elbows' of the final state's curve when determining if a transition is possible. This change occurs because, for a transition to take place in pure state entanglement theory, we require that the final state majorizes the initial state, whilst in the theory of NO we require that the initial state majorizes the final.

The unit for quantifying entanglement costs is the ebit: the maximally entangled state with local dimension 2. The maximally entangled state with local dimension $d$:

$$|e_d\rangle = \frac{1}{\sqrt{d}} \sum_{i=0}^{d-1} |ii\rangle, \tag{A4}$$

requires the two parties to share $\log d$ ebits to prepare it and they can extract $\log d$ shared ebits if they share one. Separable states are free within this resource theory, so if we define:

$$|\mathrm{sep}_d\rangle = |0\rangle_A |0\rangle_B, \tag{A5}$$

as a separable pure state with local dimension $d$, $|\mathrm{sep}_d\rangle$ costs 0 ebits to prepare and no shared entanglement can be extracted from it. Note that:

$$L_y\left(\rho_{|\Psi\rangle \otimes |e_d\rangle}\right) = L_y\left(\rho_{|\Psi\rangle}\right), \tag{A6}$$

$$L_y\left(\rho_{|\Psi\rangle \otimes |\mathrm{sep}_d\rangle}\right) = \frac{1}{d} L_y\left(\rho_{|\Psi\rangle}\right). \tag{A7}$$

The entanglement of transition, $E_{|\Psi\rangle\to|\Phi\rangle}$, is the optimal amount of shared, bipartite entanglement that the parties need to add, or can gain, to transform a copy of $|\Psi\rangle$ into $|\Phi\rangle$ under LOCC. If the quantity is negative, entanglement must be used up to make the transition possible, while if it is positive, entanglement can be extracted. $E_{|\Psi\rangle\to|\Phi\rangle}$ is the maximum value of $v\log d_2 - u\log d_1$ that can be achieved where $u, v, d_1, d_2 \in \mathbb{Z}$ are such that:

$$|\Psi\rangle |e_{d_1}\rangle^{\otimes u} |\mathrm{sep}_{d_2}\rangle^{\otimes v} \xrightarrow{\mathrm{LOCC}} |\Phi\rangle |e_{d_2}\rangle^{\otimes v} |\mathrm{sep}_{d_1}\rangle^{\otimes u}. \tag{A8}$$
(AS) 
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In terms of Lorenz curves, the addition of entangled and separable state serve to rescale (with respect to the :r-axis) 
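the curves associated with $|\Psi\rangle$ and $|\Phi\rangle$ by $d_2^{-v}$ and $d_1^{-u}$ respectively. To maximize $E_{|\Psi\rangle\to|\Phi\rangle}$, the Lorenz curve of the rescaled $|\Psi\rangle$ needs to lie just to the right of the Lorenz curve of the rescaled $|\Phi\rangle$. Hence:

$$\frac{1}{d_2^{v}} L_y\left(\rho_{|\Psi\rangle}\right) \ge \frac{1}{d_1^{u}} L_y\left(\rho_{|\Phi\rangle}\right), \quad \forall y \in \mathcal{D}\left(\rho_{|\Psi\rangle}\right), \tag{A9}$$

with equality for some $y$. This gives:

$$2^{-E_{|\Psi\rangle\to|\Phi\rangle}} = \frac{d_1^{u}}{d_2^{v}} = \max_{y \in \mathcal{D}\left(\rho_{|\Psi\rangle}\right)} \frac{L_y\left(\rho_{|\Phi\rangle}\right)}{L_y\left(\rho_{|\Psi\rangle}\right)}, \tag{A10}$$

in analogy with Lemma 2 for the work of transition in Noisy Operations.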
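A small numerical sketch of Eq. (A10) is given below. It assumes that the Lorenz curve of Section II plots the cumulative sum of the $k$ largest eigenvalues against $k/n$ (a normalized x-axis, consistent with the scalings in Eqs. (A6) and (A7)), that both reduced states are supplied as eigenvalue lists of equal length (padded with zeros if necessary), and that logarithms are taken base 2 so the answer is in ebits. The function names are hypothetical.

```python
import math

def lorenz_curve(eigs):
    """Lorenz curve points (k/n, sum of the k largest eigenvalues), from (0, 0)."""
    n = len(eigs)
    points, y = [(0.0, 0.0)], 0.0
    for k, p in enumerate(sorted(eigs, reverse=True), start=1):
        y += p
        points.append((k / n, y))
    return points

def width(points, y, tol=1e-12):
    """L_y: horizontal distance from the y-axis to the curve at height y."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y <= y1 + tol:
            return min(x1, x0 + (x1 - x0) * (y - y0) / (y1 - y0))
    return points[-1][0]

def entanglement_of_transition(eigs_psi, eigs_phi):
    """E with 2^(-E) = max over elbow heights y of psi of L_y(phi) / L_y(psi)."""
    curve_psi = lorenz_curve(eigs_psi)
    curve_phi = lorenz_curve(eigs_phi)
    heights = {y for _, y in curve_psi[1:] if y > 0}
    ratio = max(width(curve_phi, y) / width(curve_psi, y) for y in heights)
    return -math.log2(ratio)

# Reduced-state eigenvalues (squared Schmidt coefficients) of two-qubit states.
ebit = [0.5, 0.5]     # maximally entangled state
product = [1.0, 0.0]  # separable state

e_distill = entanglement_of_transition(ebit, product)
e_create = entanglement_of_transition(product, ebit)
```

In this example the sketch recovers the expected values: one ebit can be extracted when an ebit is converted into a product state, and one ebit must be consumed for the reverse transformation.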

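This can be generalized to consider situations where we require only that the final state is $\epsilon$-close to the target state $|\Phi\rangle$ with respect to a measure such as the squared fidelity, $F^2(|\Phi'\rangle, |\Phi\rangle) = |\langle\Phi'|\Phi\rangle|^2$. Let:

$$b^{\epsilon}(|\Phi\rangle) = \left\{ |\Phi'\rangle : |\langle\Phi'|\Phi\rangle|^2 \ge 1 - \epsilon \right\}. \tag{A11}$$

Then, defining $E^{\epsilon}_{|\Psi\rangle\to|\Phi\rangle}$ by:

$$E^{\epsilon}_{|\Psi\rangle\to|\Phi\rangle} = \max_{|\Phi'\rangle \in b^{\epsilon}(|\Phi\rangle)} E_{|\Psi\rangle\to|\Phi'\rangle}, \tag{A12}$$

we can write:

$$E^{\epsilon}_{|\Psi\rangle\to|\Phi\rangle} = \max_{|\Phi'\rangle \in b^{\epsilon}(|\Phi\rangle)} \left\{ -\log \max_{y \in \mathcal{D}\left(\rho_{|\Psi\rangle}\right)} \frac{L_y\left(\rho_{|\Phi'\rangle}\right)}{L_y\left(\rho_{|\Psi\rangle}\right)} \right\}. \tag{A13}$$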
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Appendix B: Heralded probability 

In this work we have considered the optimization of $p$ in the process:

$$\rho \xrightarrow{\mathrm{TO}} \rho' = p\sigma + (1-p)X, \tag{B1}$$

for given $\rho$ and $\sigma$. Another related notion of a probabilistic transformation is that of heralded probability, i.e. a conclusive fluctuation to a state. In this setup, a qubit flag system with trivial Hamiltonian $H_F \propto \mathbb{I}$ is incorporated which starts in the state $|0\rangle$ and, after the Thermal Operation, indicates whether the system was successfully transformed into $\sigma$. More concretely, with respect to heralded probability and for given $\rho$ and $\sigma$, one would attempt to maximize $p$ in the process:

$$\rho \otimes |0\rangle\langle 0| \xrightarrow{\mathrm{TO}} \hat{\rho} = p\,\sigma \otimes |0\rangle\langle 0| + (1-p)X \otimes |1\rangle\langle 1|, \tag{B2}$$

where the total Hamiltonian is $H = H_S + H_F$. A measurement on the flag will result in the system being in state $\sigma$ with probability $p$ and state $X$ with probability $1 - p$.

When $\sigma$ is block-diagonal in the energy eigenbasis, the measurement strategy given in Section III B can be used to convert a protocol obtaining a value of $p$ in Eq. (B1) into one that obtains a value of $p$ in Eq. (B2). Indeed, since our initial manuscript, it has been shown that the maximum value of $p$ that can be achieved in both scenarios for such a $\sigma$ is identical [58].

However, analyzing the optimization of $p$ in Eq. (B2) is more tractable than the equivalent problem with respect to Eq. (B1), as for the problem of heralded probability we may always take $X = \tau_S$, the thermal state of the system. To see this, assume that we start with the state-Hamiltonian pair:

$$\left( p\,\sigma \otimes |0\rangle\langle 0| + (1-p)\,X \otimes |1\rangle\langle 1|,\; H_S + H_F \right), \tag{B3}$$


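and then apply the following Thermal Operations:

1. Append a thermal state $\tau_B$ with Hamiltonian $H_B = H_S$:

$$\left( p\,\sigma \otimes |0\rangle\langle 0| + (1-p)\,X \otimes |1\rangle\langle 1|,\; H_S + H_F \right) \xrightarrow{\mathrm{TO}} \left( p\,\sigma \otimes \tau_B \otimes |0\rangle\langle 0| + (1-p)\,X \otimes \tau_B \otimes |1\rangle\langle 1|,\; H_S + H_B + H_F \right).$$

2. Apply the unitary $U = \mathbb{I}_{SB} \otimes |0\rangle\langle 0| + U_{SB}^{\mathrm{swap}} \otimes |1\rangle\langle 1|$, where $U_{SB}^{\mathrm{swap}}$ is the unitary that swaps the state of the system with the state of the bath. As $H_S = H_B$, $[U, H_S + H_B + H_F] = 0$ and hence $U$ is a valid Thermal Operation. This implements:

$$\left( p\,\sigma \otimes \tau_B \otimes |0\rangle\langle 0| + (1-p)\,X \otimes \tau_B \otimes |1\rangle\langle 1|,\; H_S + H_B + H_F \right) \xrightarrow{\mathrm{TO}} \left( p\,\sigma \otimes \tau_B \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes X \otimes |1\rangle\langle 1|,\; H_S + H_B + H_F \right).$$

3. Discard the bath system:

$$\left( p\,\sigma \otimes \tau_B \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes X \otimes |1\rangle\langle 1|,\; H_S + H_B + H_F \right) \xrightarrow{\mathrm{TO}} \left( p\,\sigma \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes |1\rangle\langle 1|,\; H_S + H_F \right).$$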


Hence, given a state of the form $\hat{\rho}$ in Eq. (B2), we can always find a Thermal Operation that converts $X$ into $\tau_S$. In attempting to maximize $p$ in Eq. (B2) we can thus always assume that $X$ is the thermal state of the system. This simplification will enable us to prove additional bounds on the maximum value of the heralded probability, $p$, for Catalytic Thermal Operations and the case where $\sigma$ contains coherences in energy.
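The three steps above can be illustrated with a minimal sketch. It assumes, purely for simplicity, that $\sigma$ and $X$ are diagonal in the energy eigenbasis, so that states reduce to population vectors and the controlled swap becomes a permutation of level labels. The function names `thermal` and `herald_with_thermal_failure` are hypothetical and chosen for this illustration only.

```python
from itertools import product
from math import exp, isclose


def thermal(energies, beta=1.0):
    """Gibbs populations for the given energy levels."""
    weights = [exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def herald_with_thermal_failure(p, sigma, x, energies, beta=1.0):
    """Apply steps 1-3 to populations indexed by (system level, flag)."""
    tau = thermal(energies, beta)
    n = len(energies)
    # Starting point, Eq. (B3): p*sigma on flag 0 and (1-p)*X on flag 1.
    joint = {(s, 0): p * sigma[s] for s in range(n)}
    joint.update({(s, 1): (1 - p) * x[s] for s in range(n)})
    # Step 1: append a bath with the same spectrum in its thermal state.
    with_bath = {(s, b, f): joint[(s, f)] * tau[b]
                 for s, b, f in product(range(n), range(n), (0, 1))}
    # Step 2: swap system and bath labels whenever the flag reads 1.
    swapped = {}
    for (s, b, f), weight in with_bath.items():
        target = (b, s, f) if f == 1 else (s, b, f)
        # The swap leaves E_s + E_b unchanged because H_S = H_B.
        assert isclose(energies[s] + energies[b],
                       energies[target[0]] + energies[target[1]])
        swapped[target] = weight
    # Step 3: discard the bath by summing over its label.
    out = {(s, f): 0.0 for s, f in product(range(n), (0, 1))}
    for (s, b, f), weight in swapped.items():
        out[(s, f)] += weight
    return out
```

Whatever populations are chosen for `x`, the flag-1 branch of the output carries the thermal populations weighted by $1-p$, while the flag-0 branch is untouched.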


1. Heralded probability with catalysts 

In Catalytic Thermal Operations, given $\rho$ and $\sigma$, we are interested in whether there exists a state $\omega$ such that:

$$\rho \otimes \omega \xrightarrow{\mathrm{TO}} \sigma \otimes \omega. \tag{B4}$$

If such an $\omega$ exists, we say it catalyzes the transformation and write $\rho \xrightarrow{\mathrm{CTO}} \sigma$. Determining whether such an $\omega$ exists has resulted in a family of second laws of thermodynamics [25].

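Defining the generalized free energies of $(\rho, H_S)$ by:

$$F_\alpha(\rho \| \tau_S) = kT\, D_\alpha(\rho \| \tau_S) - kT \log Z_S, \tag{B5}$$

where $D_\alpha$ are the Rényi divergences given by:

$$D_\alpha(\rho \| \tau_S) = \frac{\mathrm{sgn}(\alpha)}{\alpha - 1} \log \mathrm{tr}\left[ \rho^\alpha \tau_S^{1-\alpha} \right], \tag{B6}$$

then for block-diagonal $\sigma$, $\rho \xrightarrow{\mathrm{CTO}} \sigma$ if and only if $F_\alpha(\rho_D \| \tau_S) \geq F_\alpha(\sigma \| \tau_S)$ holds $\forall \alpha \geq 0$. If $\sigma$ is not block-diagonal, then by replacing $\sigma$ with $\sigma_D$ in these expressions we obtain conditions that are necessary but not sufficient.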
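As a rough illustration of Eqs. (B5) and (B6), the sketch below evaluates the generalized free energies for energy-diagonal states given as population vectors, and tests the free energy conditions on a finite list of $\alpha$ values. The limits $\alpha = 0, 1, \infty$ are handled by their standard limiting expressions, logarithms are natural, and the helper names are hypothetical. Checking finitely many $\alpha$ is only a necessary test, not a proof that a catalyst exists.

```python
from math import exp, inf, log


def renyi_divergence(p, q, alpha):
    """D_alpha(p||q) in nats for alpha >= 0; q must be strictly positive."""
    support = [(pi, qi) for pi, qi in zip(p, q) if pi > 0]
    if alpha == 0:
        return -log(sum(qi for _, qi in support))
    if alpha == 1:
        return sum(pi * log(pi / qi) for pi, qi in support)
    if alpha == inf:
        return log(max(pi / qi for pi, qi in support))
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in support)
    return log(total) / (alpha - 1)


def thermal_populations(energies, kT=1.0):
    """Return the Gibbs populations and the partition function."""
    weights = [exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z


def generalized_free_energy(p, energies, alpha, kT=1.0):
    """F_alpha = kT * D_alpha(p || tau) - kT * log Z, as in Eq. (B5)."""
    tau, z = thermal_populations(energies, kT)
    return kT * renyi_divergence(p, tau, alpha) - kT * log(z)


def second_laws_hold(rho, sigma, energies, alphas, kT=1.0):
    """Check F_alpha(rho) >= F_alpha(sigma) on the supplied alphas only."""
    return all(
        generalized_free_energy(rho, energies, a, kT)
        >= generalized_free_energy(sigma, energies, a, kT) - 1e-12
        for a in alphas
    )
```

For $\alpha = 1$ the function reduces to the ordinary free energy $\langle E \rangle - kTS$, which gives a quick sanity check.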

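To optimize the heralded probability of a transformation from $\rho$ to $\sigma$ under Catalytic Thermal Operations, we thus want to maximize the value of $p$ in $\hat{\rho} = p\,\sigma \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes |1\rangle\langle 1|$ subject to these free energy constraints applied to $\rho$ and $\hat{\rho}$. This gives us:

$$p \leq \max\left\{ p : F_\alpha\left( \rho_D \otimes |0\rangle\langle 0| \,\Big\|\, \tau_S \otimes \tfrac{\mathbb{I}}{2} \right) \geq F_\alpha\left( p\,\sigma_D \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes |1\rangle\langle 1| \,\Big\|\, \tau_S \otimes \tfrac{\mathbb{I}}{2} \right),\; \forall \alpha \in [0, \infty] \right\}. \tag{B7}$$

Furthermore, when $\sigma$ is block diagonal in the energy eigenbasis, this bound on $p$ is achievable, as the second laws imply there exists an $\omega$ such that:

$$\rho \otimes |0\rangle\langle 0| \otimes \omega \xrightarrow{\mathrm{TO}} \left( p\,\sigma \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes |1\rangle\langle 1| \right) \otimes \omega. \tag{B8}$$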
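The bound in Eq. (B7) can be estimated numerically for energy-diagonal states. The following is a minimal sketch under stated assumptions: populations replace density matrices, the constraints are imposed only on a finite grid of $\alpha$ values (so the result can slightly overestimate the true bound), and the largest feasible $p$ is located by bisection, which relies on the feasible set being an interval containing $p = 0$. The function name `heralded_bound` and the default grid are hypothetical choices.

```python
from math import exp, log


def renyi(p, q, alpha):
    """D_alpha(p||q) for alpha > 0, alpha != 1, with q strictly positive."""
    total = sum(pi ** alpha * qi ** (1 - alpha)
                for pi, qi in zip(p, q) if pi > 0)
    return log(total) / (alpha - 1)


def heralded_bound(rho, sigma, energies, kT=1.0, alphas=None, tol=1e-10):
    """Largest p satisfying the Eq. (B7) constraints on a grid of alphas."""
    if alphas is None:
        # Assumed grid: 0.05..0.95 below one, 1.05..20.8 above one.
        alphas = ([0.05 * k for k in range(1, 20)]
                  + [1.05 + 0.25 * k for k in range(80)])
    weights = [exp(-e / kT) for e in energies]
    z = sum(weights)
    tau = [w / z for w in weights]
    # Reference state tau_S (x) I/2, ordered as flag-0 block then flag-1 block.
    reference = [t / 2 for t in tau] * 2
    start = list(rho) + [0.0] * len(rho)

    def allowed(p):
        final = [p * s for s in sigma] + [(1 - p) * t for t in tau]
        return all(renyi(start, reference, a)
                   >= renyi(final, reference, a) - 1e-12
                   for a in alphas)

    if allowed(1.0):
        return 1.0
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2
        if allowed(mid):
            low = mid
        else:
            high = mid
    return low
```

The common offset $-kT\log Z$ cancels on both sides of the inequality, so the sketch compares the divergences directly. Refining the grid, in particular towards large $\alpha$, tightens the estimate.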


2. Heralded probability for arbitrary quantum states 


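Quantum generalizations of the Rényi divergences have also been used to construct constraints on coherence manipulation under Thermal Operations. Specifically, if we define the free coherence of a state $\rho$ by:

$$A_\alpha(\rho) = S_\alpha(\rho \| \rho_D), \tag{B9}$$

where $S_\alpha$ are the quantum Rényi divergences given by:

$$S_\alpha(\rho \| \rho_D) = \begin{cases} \dfrac{1}{\alpha - 1} \log \mathrm{tr}\left[ \rho^\alpha \rho_D^{1-\alpha} \right], & \alpha \in [0, 1), \\[2ex] \mathrm{tr}\left[ \rho \left( \log \rho - \log \rho_D \right) \right], & \alpha = 1, \\[2ex] \dfrac{1}{\alpha - 1} \log \mathrm{tr}\left[ \left( \rho_D^{\frac{1-\alpha}{2\alpha}} \rho\, \rho_D^{\frac{1-\alpha}{2\alpha}} \right)^\alpha \right], & \alpha > 1, \end{cases} \tag{B10}$$

then it has been shown that for general $\sigma$, $\rho \xrightarrow{\mathrm{TO}} \sigma$ only if $A_\alpha(\rho) \geq A_\alpha(\sigma)$ for all $\alpha \geq 0$.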

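Using this we obtain the following bound on the maximum heralded probability of a transformation from $\rho$ to $\sigma$ under Thermal Operations:

$$p \leq \max\left\{ p : A_\alpha\left( \rho \otimes |0\rangle\langle 0| \right) \geq A_\alpha\left( p\,\sigma \otimes |0\rangle\langle 0| + (1-p)\,\tau_S \otimes |1\rangle\langle 1| \right),\; \forall \alpha \in [0, \infty] \right\}. \tag{B11}$$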
Appendix C: The tradeoff between probability and work of transition for a qubit under Noisy Operations


In this appendix we consider how $p^*$ varies if we supply additional work when attempting to convert $\rho$ into $\sigma$. Alternatively, we could attempt to extract extra work during the process. Whilst characterizing the behavior of $p^*$ in general is an open question, here we give the solution for qubit systems with trivial Hamiltonian.

Consider two qubits: $\rho$ with ordered eigenvalues $\vec{\eta} = \{\eta_1, \eta_2\}$ and $\sigma$ with ordered eigenvalues $\vec{\zeta} = \{\zeta_1, \zeta_2\}$. For the transition:

$$\begin{aligned} \rho \otimes s_{|W|} &\to \rho' = p\,\sigma + (1-p)\,X, && \text{if } W \leq 0, \\ \rho &\to \rho' = p\,\sigma \otimes s_{|W|} + (1-p)\,X, && \text{if } W > 0, \end{aligned} \tag{C1}$$


how does $p^*$ behave as a function of $W$? Note that using Theorem 1, $p^*(0)$ is given by $\min\left\{1, \frac{\eta_1}{\zeta_1}\right\}$. For $W \leq W_{\rho\to\sigma}$, by definition we have that $p^*(W) = 1$ (as for these values of $W$, the transition can be performed deterministically). So as to investigate the behavior of the function at $W = 0$, in what follows we shall assume $\eta_1 < \zeta_1$ and hence $W_{\rho\to\sigma} < 0$.

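First take $W \leq 0$ and, for simplicity, assume it can be written as $W = -\log\frac{d}{j}$. Then:

$$\rho \otimes s_{|W|} = \mathrm{diag}\Big( \underbrace{\tfrac{\eta_1}{j}, \ldots, \tfrac{\eta_1}{j}}_{j}, \underbrace{\tfrac{\eta_2}{j}, \ldots, \tfrac{\eta_2}{j}}_{j}, \underbrace{0, \ldots, 0}_{2(d-j)} \Big), \tag{C2}$$

$$\sigma \otimes \frac{\mathbb{I}}{d} = \mathrm{diag}\Big( \underbrace{\tfrac{\zeta_1}{d}, \ldots, \tfrac{\zeta_1}{d}}_{d}, \underbrace{\tfrac{\zeta_2}{d}, \ldots, \tfrac{\zeta_2}{d}}_{d} \Big). \tag{C3}$$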


We now use Theorem 1 together with the fact that $p^*(W)$ will occur at an 'elbow' of $\sigma$ (which is equivalent to $\sigma \otimes \frac{\mathbb{I}}{d}$ under Noisy Operations). As $W_{\rho\to\sigma} < W$, and the transition does not happen with certainty, we need only consider the elbow $l = d$ in Theorem 1. Thus:


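$$p^*(W) = \frac{V_d\left( \rho \otimes s_{|W|} \right)}{V_d\left( \sigma \otimes \frac{\mathbb{I}}{d} \right)} = \frac{\eta_1 + \frac{d-j}{j}\,\eta_2}{\zeta_1}, \qquad W_{\rho\to\sigma} < -\log\frac{d}{j} \leq 0. \tag{C4}$$

This can be rearranged to give:

$$p^*(W) = \left( 2 - 2^{-W} \right) p^*(0) + \frac{2^{-W} - 1}{\zeta_1}, \qquad W_{\rho\to\sigma} < W \leq 0. \tag{C5}$$

Now take $W > 0$ and assume it can be written as $W = \log\frac{d}{j}$. Then:

$$\rho \otimes \frac{\mathbb{I}}{d} = \mathrm{diag}\Big( \underbrace{\tfrac{\eta_1}{d}, \ldots, \tfrac{\eta_1}{d}}_{d}, \underbrace{\tfrac{\eta_2}{d}, \ldots, \tfrac{\eta_2}{d}}_{d} \Big), \tag{C6}$$

$$\sigma \otimes s_{|W|} = \mathrm{diag}\Big( \underbrace{\tfrac{\zeta_1}{j}, \ldots, \tfrac{\zeta_1}{j}}_{j}, \underbrace{\tfrac{\zeta_2}{j}, \ldots, \tfrac{\zeta_2}{j}}_{j}, \underbrace{0, \ldots, 0}_{2(d-j)} \Big). \tag{C7}$$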

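There are two 'elbows' on $\sigma \otimes s_{|W|}$, at $l = j$ and $l = 2j$. Calculating the ratio of the monotones at these points gives:

$$\frac{V_j\left( \rho \otimes \frac{\mathbb{I}}{d} \right)}{V_j\left( \sigma \otimes s_{|W|} \right)} = \frac{\frac{j}{d}\,\eta_1}{\zeta_1} = \frac{\eta_1}{\zeta_1}\, 2^{-W}, \tag{C8}$$

$$\frac{V_{2j}\left( \rho \otimes \frac{\mathbb{I}}{d} \right)}{V_{2j}\left( \sigma \otimes s_{|W|} \right)} = \begin{cases} \dfrac{2j}{d}\,\eta_1 = 2\eta_1\, 2^{-W}, & \text{if } 2j \leq d, \\[2ex] \eta_1 + \dfrac{2j - d}{d}\,\eta_2 = (2\eta_1 - 1) + 2(1 - \eta_1)\, 2^{-W}, & \text{if } 2j > d. \end{cases} \tag{C9}$$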


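It is easy to see that $\frac{\eta_1}{\zeta_1} < 2\eta_1$ since $\zeta_1 > \frac{1}{2}$. Comparing Eq. (C8) with the second case in Eq. (C9), it is possible to show that:

$$\frac{V_j\left( \rho \otimes \frac{\mathbb{I}}{d} \right)}{V_j\left( \sigma \otimes s_{|W|} \right)} \leq \frac{V_{2j}\left( \rho \otimes \frac{\mathbb{I}}{d} \right)}{V_{2j}\left( \sigma \otimes s_{|W|} \right)} \iff 2^{W} \geq \frac{\eta_1 - 2\zeta_1 + 2\eta_1\zeta_1}{2\eta_1\zeta_1 - \zeta_1}. \tag{C10}$$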
FIG. 8. Here we show how $p^*(W)$ varies as a function of $W$ for qubits under Noisy Operations when $W_{\rho\to\sigma} < 0$. Note the behavior at $W = 0$, indicating the function is not convex in $W > W_{\rho\to\sigma}$.


As $W > 0$, the minimum ratio occurs at $l = j$. Hence:

$$p^*(W) = p^*(0)\, 2^{-W}, \qquad W > 0. \tag{C11}$$
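The $W > 0$ result can be cross-checked by evaluating the monotone ratios directly at the two elbows $l = j$ and $l = 2j$. The sketch below is an illustration only: it assumes $W = \log\frac{d}{j}$ with integers $j \leq d$, takes $V_l$ to be the sum of the $l$ largest eigenvalues, and uses exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def lorenz_height(eigenvalues, l):
    """V_l: the sum of the l largest eigenvalues."""
    return sum(sorted(eigenvalues, reverse=True)[:l])


def p_star_extracting_work(eta1, zeta1, j, d):
    """Minimum monotone ratio at the elbows l = j and l = 2j for W = log(d/j)."""
    eta = [eta1, 1 - eta1]
    zeta = [zeta1, 1 - zeta1]
    # rho (x) I/d: each eigenvalue of rho spread evenly over d levels.
    rho_mixed = [e / d for e in eta for _ in range(d)]
    # sigma (x) s_|W|: each eigenvalue of sigma spread over j levels, rest empty.
    sigma_sharp = ([z / j for z in zeta for _ in range(j)]
                   + [Fraction(0)] * (2 * (d - j)))
    ratios = [lorenz_height(rho_mixed, l) / lorenz_height(sigma_sharp, l)
              for l in (j, 2 * j)]
    return min(1, min(ratios))
```

For example, with $\vec{\eta} = \{3/5, 2/5\}$, $\vec{\zeta} = \{17/20, 3/20\}$, $j = 1$ and $d = 2$ (so $W = 1$), the minimum is attained at $l = j$ and equals $p^*(0)/2$.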

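Combining these results, we have that for $\eta_1 < \zeta_1$:

$$p^*(W) = \begin{cases} 1, & \text{if } W \leq W_{\rho\to\sigma}, \\[1ex] \left( 2 - 2^{-W} \right) p^*(0) + \dfrac{2^{-W} - 1}{\zeta_1}, & \text{if } W_{\rho\to\sigma} < W \leq 0, \\[1ex] p^*(0)\, 2^{-W}, & \text{if } 0 < W. \end{cases} \tag{C12}$$

As an example, in Figure 8 we plot $p^*(W)$ against $W$ for $\vec{\eta} = \{0.6, 0.4\}$ and $\vec{\zeta} = \{0.85, 0.15\}$.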
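The piecewise result for $\eta_1 < \zeta_1$ is simple to evaluate. In the sketch below, the function name `p_star` is hypothetical and logarithms are taken base 2. The value of $W_{\rho\to\sigma}$ is not quoted from the text: it is obtained, as an assumption of this illustration, by requiring the middle branch to equal 1, which gives $W_{\rho\to\sigma} = \log\frac{\eta_2}{\zeta_1 - \eta_1 + \eta_2}$.

```python
from math import log2


def p_star(work, eta1, zeta1):
    """Maximum transition probability for qubits with eta1 < zeta1."""
    if not 0.5 <= eta1 < zeta1 <= 1:
        raise ValueError("expected ordered eigenvalues with 1/2 <= eta1 < zeta1 <= 1")
    p0 = eta1 / zeta1  # p*(0)
    eta2 = 1 - eta1
    # Assumption: W_{rho->sigma} is where the middle branch reaches 1.
    w_transition = log2(eta2 / (zeta1 - eta1 + eta2))
    if work <= w_transition:
        return 1.0
    if work <= 0:
        return (2 - 2 ** (-work)) * p0 + (2 ** (-work) - 1) / zeta1
    return p0 * 2 ** (-work)
```

Calling `p_star(w, 0.6, 0.85)` over a range of `w` should trace out the curve described in the caption of Figure 8, including the change of slope at $W = 0$.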
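For completeness, for $\eta_1 > \zeta_1$:

$$p^*(W) = \begin{cases} 1, & \text{if } W \leq W_{\rho\to\sigma}, \\[1ex] (2\eta_1 - 1) + 2(1 - \eta_1)\, 2^{-W}, & \text{if } W_{\rho\to\sigma} < W \leq \log\left( \dfrac{\eta_1 - 2\zeta_1 + 2\eta_1\zeta_1}{2\eta_1\zeta_1 - \zeta_1} \right), \\[2ex] \dfrac{\eta_1}{\zeta_1}\, 2^{-W}, & \text{if } W \geq \log\left( \dfrac{\eta_1 - 2\zeta_1 + 2\eta_1\zeta_1}{2\eta_1\zeta_1 - \zeta_1} \right). \end{cases} \tag{C13}$$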







